Ivor Williams
I shall end with two unanswered questions. The reason for that lies in a story with eight decimal places of recondite mystery and scarcely believable deductions. One last glimpse of reality: the mean temperature of the world at the moment (early November) is hovering around 14 deg C, which is never used because it does not convey a sufficient element of danger in the global warming message. Fourteen degrees Celsius or fifty-seven Fahrenheit are not messages of imminent doom. Either one is the annual mean temperature of Bordeaux, San Francisco or Canberra.
Therefore the Wise Ones have decided that any global temperature given to the masses must always be shown as a difference from the mean of the half-century 1850-1900, which, they say, is representative of our world in smoke-free pre-industrial times. That period also happens to be towards the end of the Little Ice Age, which, the Met Office says, had ‘particularly cold intervals beginning in about 1650, 1770 and 1850.’ Cold spell beginning in 1850? Interesting.
Thus it was that on 10 January this year the Met Office told us that ‘The global average temperature for 2024 was 1.53±0.08°C above the 1850-1900 global average.’ This is an extraordinarily accurate figure, but the World Meteorological Organisation has much the same: ‘The global average surface temperature [in 2024] was 1.55 °C … ± 0.13 °C … above the 1850-1900 average, according to WMO’s consolidated analysis.’ Ignore the scarcely believable accuracy of those second decimal places; there’s worse to come.
The obvious question is: Why were those fifty years chosen as the fundamental reference period? The answer is easily found: ‘Global-scale observations from the instrumental era began in the mid-19th century for temperature,’ says the Intergovernmental Panel on Climate Change (IPCC) in their Fifth Assessment Report (Section B, page 4). An associated IPCC Special Report (FAQ1.2 para 4) explains that ‘The reference period 1850–1900 … is the earliest period with near-global observations and is … used as an approximation of pre-industrial temperature.’ Note the categoric statements that sufficient data is available in that nineteenth-century fifty-year period to calculate the global mean temperatures.
In 1850, may I remind you, Dickens was writing David Copperfield, California was admitted to the Union as the 31st state and vast areas of the earth were still unexplored. 1900 brought the Boxer Rebellion (China), the Boer War (South Africa) and the Galveston hurricane (USA). There were still quite large areas awaiting intrepid explorers.
I was curious about how in olden times those global temperatures were actually measured, but after a painstaking search of websites and yet again proving that AI-derived information can be both wrong and misleading, I turned in despair to the Met Office enquiry desk. Their reply was long and very detailed. No actual data, but several clues as to where to search. Very interesting clues.
The IPCC claim above of ‘global-scale observations’ is obviously true, because the World Meteorological Organisation has a comprehensive graph showing six different global mean temperature series as differences from the 1850-1900 period. But a ‘Get the data’ link on the same page leads to the following curious table of the Met Office anomalies:
1850 -0.1797
1851 -0.0592
then every year to
1899 0.0128
1900 0.1218
then every year to
2023 1.4539
2024 1.5361
There is even more accurate Met Office data from the past, this time anomalies relative to the 1961-1990 period, and this time totally unbelievable: all from HadCRUT5.1.0.0, Summary Series, Global, CSV file, Annual.
1850 -0.42648312
1851 -0.2635183
then every year to
1899 -0.34430692
1900 -0.2301605
then every year to
2024 1.1690052
Dig further and monthly values are produced. You can’t help being suspicious of even two decimal places, let alone eight. I dug deeper. I found graphs.
They show northern and southern hemispheres separately, with both station count and coverage percentage. They are from a paper: Hemispheric and large-scale land surface air temperature variations: An extensive revision and an update to 2010, P.D. Jones et al. Page 48, line 1120. They show the number of recording stations and the hemisphere percentage covered from 1850 to 2010.
Very similar pictures are also shown in Land Surface Air Temperature Variations Across the Globe Updated to 2019: The CRUTEM5 Data Set, T J Osborn et al, para 5.1 fig 6, and Hemispheric Surface Air Temperature … to 1993, P D Jones 1993, page 1797.
Approximate readings from the above graphs:
1850: Northern hemisphere coverage ~7%; Southern hemisphere coverage ~0-1%
1900: Northern hemisphere coverage ~23%; Southern hemisphere coverage ~6%
Surely not? There must be a mistake somewhere. But there’s nothing like a graph in scientific peer-reviewed papers for providing clear and unequivocal information. If you still think this just cannot be true, then look further at the American Meteorological Society map of station density 1861-1890 (Section 5 of Journal), or a classic Bartholomew map of world reporting stations in 1899.
The information supplied by the Met Office led me to a meandering pathway of scientific papers covering thirty-odd years of intensive research into the problem of accurately measuring global mean temperatures from 1850 onwards. The path seems to have ended in a swirling fog.
Those graphs show that even by 1900 only about 15% of the earth had recording stations. And the 1850 data is apparently extracted from only around 4%.
How can world temperatures be measured that accurately with such an impossibly small amount of data – almost nothing from the oceans and most of the rest from North America and Western Europe?
It wouldn’t really matter except for someone having decided that current global mean temperatures should always be shown to the worried world as anomalies compared with the 1850-1900 data, which is itself possibly a cooler climatic period. The intention must be to demonstrate clearly that there is no doubt that we are indeed warming up dangerously, and if we don’t do something about it soon it will be too late and don’t say we didn’t warn you.
But, and this is one huge ‘but’, how can the 1850 mean global temperature be recorded, for instance, as -0.1797 deg C less than the mean of 1850-1900, when it seems that reporting stations covered only about 4% of the earth at that time? And why to a totally unrealistic ten-thousandth of a degree?
I did warn you this would end with two unanswered questions, and here they are, both about that fifty-year 1850-1900 period:
Where can we consult the actual original global data?
How were those incredibly accurate anomaly figures calculated?
Step aside, my good Ivor, nothing to see here. Believe the hype and join the movement.
<sarc>
I can feel the movement down deep in my bowels
“How were those incredibly accurate anomaly figures calculated?”
Particularly since the measuring tool accuracy was +/- 1° and all the tools were different entities. 🤔🤷‍♂️
Regardless of the accuracy of the instrument, all measurements were recorded to the closest degree, as measured by yee olde Mark I eyeball.
I might be nit picking, but:
My eyesight is deteriorating. The ophthalmologist tells me it is perfectly normal, even for one of the finest of the species, at my age. So they sell me some ridiculously expensive glasses. I guess people my age (still working) would have needed glasses even in 1850. Can we assume a Mark 0.1 eyeball and even question the +/- 1?
Probably a bigger problem was if people of different heights read the thermometer and were unaware of the problem of parallax.
I would have liked that job, staying there all day and all night to catch that magical moment when the low and highs were hit.
I don’t have any charts on how it worked, but they had indicators that stored the daily high and low. They had to be reset after being read each day.
Also, these indicators are another source of potential error. Especially if they aren’t being properly maintained.
My parents used to have one. The little iron plugs atop the mercury column were re-set with a magnet.
Particularly as the weather reporting was mostly done as a hobby by the local schoolteacher, Reverend, postmaster, or gentleman dilettante. Enthusiasm, but not much rigor or training. And no real effort to calibrate the thermometers or the procedure.
More like +/- 5 degrees.
The absolute error might have been that high because of low manufacturing tolerances and lack of calibration. However, the relative error for the same thermometer was probably less than +/- 2 degrees.
The problem is with the “global” part of mean temperature. It’s utter nonsense, no physical meaning.
From the second paragraph of the above article:
“. . . any global temperature given to the masses must always be shown as a difference from the mean of the half-century 1850-1900, which, they say, is representative of our world in smoke-free pre-industrial times.”
While there may be some truth to the claim about that interval being “pre-industrial times”, it is absolutely false to claim that temperature measurements taken over that period were “representative of our world”.
Some facts:
— The oldest permanently manned weather station on the continent of Antarctica was established in 1904
— There were no ships capable of monitoring weather (including air temperatures) crossing the Arctic during that time . . . the first exploration vessel to pass along the Northwest Passage (from Greenland to Alaska) was the Norwegian sloop Gjøa in 1906
— Data on the world’s ocean temperatures (and consequently the air temperature immediately above) was almost entirely established by measuring bucket-sampled water temperatures from a globally restricted set of marine ships traveling well-established routes . . . not at all a global sampling
— One of the first people to use weather balloons was the French meteorologist Léon Teisserenc de Bort. Starting in 1896 he launched hundreds of weather balloons from his observatory in Trappes, France.
— There was no use of airplanes prior to 1903, so no airports with temperature/weather monitoring stations
— There were no earth-orbiting satellites prior to 1957 . . . and instruments carried by orbiting satellites are the only practical means of obtaining globally-representative measurements of lower atmospheric temperatures.
Good grief!
The belief that pre-1850 was smoke-free is a claim so ludicrous that only a climate alarmist could believe it.
Ahhh… Easy Peasy to get an answer to 8 decimal places: just buy a $15 calculator,
input any numbers you please, pick an operation, hit Enter, and Presto! you have an 8-digit, PRECISE number.
BUT, and it’s a big Kardashian style one, you really want an ACCURATE number.
That, my friends, is a horse of a different color!
The numbers are generated by models, then used to validate the models which generated them? I thought that was the climate science way?
You get similar results dividing small primes by larger primes.
If you add up a fairly large PRIME number of small decimal figures and the result is prime …
31 stations of small decimal numbers (between 0.3 and 0.4) adding up to 11 will give you an average of 0.35483870967…
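For what it’s worth, a throwaway check (Python, using only the numbers quoted above) shows where those extra digits come from: the division alone manufactures them, regardless of how coarsely the inputs were measured.

```python
# 31 values that sum to exactly 11, each between 0.3 and 0.4: the average
# prints a long string of digits no matter how rough the inputs were.
values = [11 / 31] * 31           # each ~0.3548..., i.e. between 0.3 and 0.4
print(sum(values) / len(values))  # 0.354838709677...
print(11 / 31)                    # same thing
```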
Large primes have small primes upon their backs to bite ’em.
Small primes have even smaller primes and so ad infinitum.
Cute😘
Is the same also true for prime-mates?
My mate is most certainly prime.
That may depend on whether the “prime-mate” is the first-mate or second-mate on the ship of life.
Like it 🙂
All global mean or average temperatures are nonsense, as temperature is always affected by pressure, which is changing all the time. Likewise, wind effects, humidity and the height above sea level of the measuring station are never taken into account, and weather in all its forms is never the same on any given day, so there cannot be a mean or average.
All weather simply occurs between maximums and minimums.
All global mean average temperatures are especially nonsense when you have 0 data for 90% of the planet 100% of the time.
But that’s what you get when you deal with globalist science.
Sometimes they have data for only a handful of years as with the ozone layer.
IIRC 1969, yet by the mid-70s they already knew 100% for sure that CFCs were to blame for the holes.
Then they have Arctic sea ice data only going back to 1979 (which is quite interesting, as the Arctic is right below the ozone hole; therefore data should go back to at least 1969),
yet the Nimbus satellites started in 1964.
Or, to put it even more simply, “Average temperature” is a meaningless calculation!
No less meaningful than average height or average family size. No family has 2½ children, but it is still useful to track whether or not that number is changing over time. And nobody uses the same tape measure to take every single height measurement of a sample.
Irrelevant drivel.
The average human has one testicle. Useful data indeed.
If we observed a change over time in the average number of testicles, it would indeed be useful data.
Did you read Clyde’s message? The family size of 2.5 children really *is* useless information since it tells you nothing about the standard deviation or variance of the data. You could see drastic changes in family sizes with no change in the “average”. You could have more families with no children, more families with more children, and the average could remain unchanged.
It’s why following the mid-point temperature gives a useless piece of information. It simply doesn’t tell you anything about what may have changed or if anything changed at all!
Tracking a single metric is never in any circumstance going to tell you everything you need to know about underlying drivers of population or system dynamics, and nobody is claiming otherwise. The claim is simply that tracking a mean can be a useful way to see if something is changing, giving you an incentive to investigate.
“nobody is claiming otherwise”
Climate science claims it every single day!
“tracking a mean can be a useful way to see if something is changing,”
As usual, you didn’t even read my comment. Your assertion is a typical climate science meme that demonstrates exactly zero understanding of data and its associated statistical descriptors.
Take the CAGW meme that a rising global temperature average is BAD, thus the “catastrophic” adjective! It’s turned out to be a benefit instead – a greener earth, more food, fewer cold deaths, on and on and on …..
It’s all because CAGW had *NO* idea of what was changing the average!
Climate science never claims this. Try picking up an assessment report sometime and reading it, you will quickly see that it discusses more than the global mean temperature.
Observing any change across any metric will never tell you why the change is occurring. Observations of change just tell you that something is happening. Then you do all of the usual science stuff to try to explain why it is happening.
Except anomalies have no common base temperature so they can not be used for a comparison of warmer or colder.
They can be used to compare relative rates of change between locations and over time, which is exactly the thing we want to be able to use them to do.
The problem is that they are treated as temperatures. Too many people interpret a larger anomaly as meaning the entire earth is warmer.
I don’t think this is a very widespread point of confusion. It’s like saying we shouldn’t track the average family size because people will think that any positive change in the average family size means that every family on earth is getting larger. Some people might not grasp the meaning of it, but they are a pretty small uneducated minority, and their presence should not stop us from using useful metrics.
When did I ever say anomalies should not be used? What I have said is that they are too many times used to gauge whether one place is warmer or colder than another.
I also don’t believe that the variance is appropriately stated for anomalies. The basic assumption is that all temperatures have no uncertainty in the interim calculations and that uncertainty is only calculated when the magnitude of the final anomalies have been reduced by at least an order of magnitude, thereby making the numbers look very, very accurate.
Can you actually point to an example of this happening? Especially in the research literature?
This is not an assumption that I’ve ever seen made.
Jim doesn’t have a clue of how anomalies work.
Look at his reply dated November 5, 2025 9:21 A.M.:
I don’t disagree. However the constant must be the same for all anomalies. The last time I looked, UAH uses different baseline values for each segment, not one constant value across the board.
Otherwise, simply looking at quoted anomalies as an indicator of temperature, you end up with the Tropics being cooler than the Global value or the Arctic being warmer than the Tropics as the October values indicate.
The whole purpose of using anomalies is that they share a common baseline. Climate scientists obviously know this and using different baselines would defeat the entire method.
Furthermore, anomalies do not indicate that the Arctic is physically warmer than the Tropics. It shows the deviation from each region’s own long term average.
I don’t think the average climate contrarian is less intelligent than anyone else. They instead suffer a bad case of Dunning-Kruger.
Funny how you never got around to addressing the issue.
Read what I said closer. People, even folks here, use anomalies as an indicator of temperature. In other words, an anomaly of 1.47 is warmer than an anomaly of 0.43. That is a faulty conclusion.
It appears you are more interested in creating a strawman that you can use to make an ad hominem. Hate to tell you, that doesn’t make you much of a scientist.
I replied to that comment in the other thread. It is you who didn’t address the issue by not replying.
“The whole purpose of using anomalies is that they share a common baseline.”
Poor statistics! Anomalies taken in the NH during summer and in the SH during winter inherit the different variances for the hemisphere, i.e. temperature variance in cold weather is larger than in hot weather.
It doesn’t matter if you are using a common baseline if that baseline is incorrectly developed by not weighting contributions to the average based on the different variances of the data.
And it’s not just the hemispheres. The temperature variance on Pike’s Peak is different than the temperature variance in Colorado Springs. Yet both sets of data are lumped together in the same data set with no weighting.
If the baseline itself is garbage then the anomalies calculated from it are garbage as well.
NOTHING in climate science is collected correctly, stated correctly, or analyzed statistically in a correct manner. It’s all 3rd grade math – it’s non-justified simplification.
If you think this is a serious statistical issue, then run the test yourself:
Recompute UAH using variance weighted grid cells as opposed to the area weighted method and show us the difference. If the global trend collapses under that method, it will be obvious. If it barely changes, then the argument dies on contact with data.
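Nobody is going to recompute UAH in a comment box, but for anyone curious what that test even looks like, here is a miniature version on purely synthetic data (every number below is invented; ‘anomaly’ and ‘cell_var’ are made-up stand-ins for gridded anomalies and per-cell variances):

```python
import numpy as np

rng = np.random.default_rng(0)

# 72 synthetic latitude bands: invented anomalies that grow toward the poles
# and invented per-cell variances that are larger at high latitudes.
lat = np.linspace(-88.75, 88.75, 72)
anomaly = 0.8 + 0.6 * np.abs(lat) / 90 + rng.normal(0.0, 0.1, lat.size)
cell_var = 0.2 + 1.8 * np.abs(lat) / 90

area_w = np.cos(np.deg2rad(lat))   # area weighting ~ cos(latitude)
var_w = 1.0 / cell_var             # inverse-variance weighting

print("area-weighted mean:    ", round(np.sum(area_w * anomaly) / np.sum(area_w), 3))
print("variance-weighted mean:", round(np.sum(var_w * anomaly) / np.sum(var_w), 3))
```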
I don’t NEED to run anything. Area weighting is used to reduce SAMPLING error, not measurement error. Variance is associated with measurement uncertainty, not sampling error. They are two entirely different things.
Variance is a metric for uncertainty. Data with different variances have to be weighted in order to properly account for the uncertainty.
If one piece of data has a larger variance then it has more uncertainty than one with a smaller variance. Averaging the two, or even just comparing them, means nothing if the difference in variance is not accounted for.
If you average 5 +/- 1 with 6 +/- 2 which one is the most accurate? If you just average 5 with 6 and get 5.5 then you are giving each equal representation in the average when the 5 +/- 1 should be given a higher weight as being the most accurate.
Yet climate science just ignores this simple statistical fact. It’s all part of the meme that all measurement uncertainty is random, Gaussian, and cancels.
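For the record, the textbook way to combine the 5 +/- 1 and 6 +/- 2 example above is an inverse-variance weighted mean. A minimal sketch, assuming the quoted +/- values are standard uncertainties and the two errors are independent:

```python
import math

x = [5.0, 6.0]
u = [1.0, 2.0]                      # stated standard uncertainties
w = [1.0 / ui**2 for ui in u]       # weights = 1 / variance

mean = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
u_mean = math.sqrt(1.0 / sum(w))    # combined uncertainty of the weighted mean

print(f"weighted mean = {mean:.2f} +/- {u_mean:.2f}")   # 5.20 +/- 0.89
```

The more certain value dominates the result, which is the point being made above.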
Uncertainty doesn’t fully cancel out, but the large number of samples greatly reduces it in the final estimate. You might find this a good read:
https://web.archive.org/web/20080402030712/tamino.wordpress.com/2007/07/05/the-power-of-large-numbers/
Malarky! All the larger samples allow is a more precise locating of the average. It does *NOT* reduce the measurement uncertainty associated with the data from which the average is calculated.
A distribution of measurements with a common systematic measurement uncertainty cannot see the variance of that distribution reduced by taking larger samples.
You are making the very same mistake that *all* statisticians seem to. The SEM is *NOT* measurement uncertainty, it is SAMPLING error.
You can reduce the sampling error by using larger samples but you CAN NOT reduce the measurement uncertainty.
Again, the measurement uncertainty is related to the variance of the data set. Larger samples will *NOT* change the variance of the data set.
Your reference has one big flaw when comparing it to averaging temperatures from different weather stations. The Tamino paper uses ONE SINGLE star. In other words, repeatable conditions that require measuring the exact same thing multiple times. He does not appear to deal with the uncertainty that can be introduced by using different measuring devices.
Your reference to large numbers only apply if the same thing is being observed a large number of times. Otherwise, Tamino could have averaged all the stars in the sky and arrived at a very good estimate, right?
Let’s talk about the weak law of large numbers (LLN) which is appropriate for probabilities. The term ‘probabilistic process’ is defined as the probability of something occurring in a repeated experiment. The flipping of a coin is a probabilistic process. That is, what percentage of the time does heads or tails occur. Rolling a die is a probabilistic process when you are examining how often each number occurs. So fundamentally the weak LLN deals with probabilities and frequencies in a process that can be repeated. It is important that the subject remain the same. For example, if I roll 10 dice 1000 times and plot the results, it could happen that I will have different frequencies for the numbers 1 – 6 because of differences in the dice.
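A quick simulation of that weak-LLN point with one fair die (seed and roll counts are arbitrary): the observed frequency of any face drifts toward 1/6 only as the number of rolls of the same die grows.

```python
import random

random.seed(1)

for n in (60, 600, 60_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(f"{n:>6} rolls: frequency of a six = {rolls.count(6) / n:.4f} (expected 0.1667)")
```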
Now, how about the strong LLN? The strong LLN deals with the average of random variables. The strong law says the average of sample means will converge to the accepted value. However, in both cases these laws must meet a requirement that is known as Independent and Identical Distributions (IID) for the samples. What does this mean?
Independent means that each choice of a member of a sample is not affected by a previous choice.
What does identically distributed mean? Each sample must have the same distribution as the population.
Now, how does all this apply to atmospheric temperatures? The weak LLN would be useful if we could expect measurements to follow predictable outcomes. For example, each day of the week having an expected temperature. In this situation, one would expect each temperature to appear 1/7th of the time. Temperatures don’t work that way. Temperatures are not unique finite numbers, each of which has a unique probability or frequency associated with it. Temperatures are an analog, continuous function with a constantly varying value.
The strong LLN might be useful if the “samples” were from the same station, but identical distributions from summer vs winter stations, different microclimates, different measuring devices, is a bit much.
The strong LLN only promises that the sample means distribution will be Gaussian and provide an accurate estimate of the population mean as described by the standard deviation of the sample means, i.e., the SEM.
Lastly, measurement uncertainty is the dispersion of measurement observations that can be attributed to the measurand. If you make the SEM small enough, you should recognize, there will be no measurement observations in that interval. Does that even make sense? No. That is why the GUM specifies the variance of the observations be used as the measurement uncertainty.
“Lastly, measurement uncertainty is the dispersion of measurement observations that can be attributed to the measurand. If you make the SEM small enough, you should recognize, there will be no measurement observations in that interval. Does that even make sense?”
Perfect! That’s why the average is *NOT* a measurement nor is it actually data, it is a descriptor of the actual data. It is the variance of the parent distribution that determines the measurement uncertainty, and the SEM does *NOT* estimate the parent distribution variance, it is a metric for how precisely you have located the parent average. It doesn’t matter how small you make the SEM, it won’t change the parent distribution variance! Nor will it estimate the measurement accuracy of the so precisely located average – the measurement accuracy of the precisely located average will remain the variance of the parent distribution.
You and every other statistician misunderstands the issue. The issue is the uncertainty and precision with which anomalies can be averaged. If uncertainty is propagated correctly it is likely that the uncertainty is in the tenths digit which would make anything below that unknowable.
Here is an exercise you can do to show you know how measurement uncertainty works. Pick a location from 1950, choose a month, find the Tmax average, use whatever baseline you choose, and show us the math that lets you calculate an anomaly to 1/100th or even 1/1000th of a degree.
If every statistician on earth disagrees with your opinion of statistics, it’s probably worth reflecting on whether or not you understand statistics.
No, it is probably worth the statisticians reflecting on how you can change the variance of a distribution by subtracting or adding a constant. It is probably worth the statisticians reflecting on how sampling error becomes measurement uncertainty.
Can you imagine the poor Engineering Statistics TA who had them in her class!? I’m sure they whined holy hell to the Dean when they got their grade. And thank the Imaginary Guy In The Sky that they are in the demo that predated Glass Door by a half century….
Ad hominems really make you look like a big boob.
FYI, when I graduated with my engineering degree, they were still teaching error analysis. It is why I understand the difference between errors and uncertainty. You obviously don’t because you never tell us what the difference is.
As usual you never address the issue but just use a smart remark as a deflection. Do you know how dumb that is?
Now tell us how many statistical courses you took that address measurement uncertainty and how it is handled.
Really? The rate of change in the temperature at Pikes Peak can be compared to the rate of change in the temperature at Colorado Springs? Exactly what will that comparison tell you?
How about the comparison between Las Vegas and Miami?
Of course they can be compared. What the difference can tell you depends on what you’re studying. Comparing rates of change across the high northern latitudes to the tropics reveals Arctic amplification, for example.
Typically it is not useful from a climatological perspective to compare trends between two point locations, and scientists more often look at weighted regional means.
You never actually say what the comparison of Pikes Peak with Colorado Springs tells you!
Since temperatures at different latitudes have different variances how do you directly compare them?
Then how does the “global average” work since it is nothing more than comparing trends from multiple point locations?
It would tell you the difference in the trend between those two locations. If you want to understand the reasons for the observed difference, you will start doing some scientific investigation.
The global average is not a comparison of point locations, it is the mean of weighted gridded averages.
“All global mean average temperatures are especially nonsense when you have 0 data for 90% of the planet 100% of the time.”
Wouldn’t matter even if you had 100% coverage. Intensive properties.
It isn’t quite that simple. If one cuts a cube of gold in half, the density of both halves are the same. If one samples a parcel of air, sampling at the leading edge of a cold front will almost certainly give a different temperature than sampling at the trailing edge. That is, where one samples a heterogeneous ‘object’ is very important and taking an average may be the only way to get a rough estimate of the character of the ‘object.’
This would require that the gradient between Point A and Point B be linear, so that the average describes the “mid-point” of the gradient. This is simply not a justifiable assumption for most intensive properties, and it is not justifiable for temperature. Atmospheric temperature gradients are simply not linear, either horizontally or vertically. It is one more simplification assumed by climate science that is not physically appropriate in the real world.
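A toy numerical illustration of that point, using an invented non-linear profile between two points A and B: the midpoint of the endpoint values and the mean of the profile along the path are not the same number.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1001)      # normalised distance from A to B
T = 30.0 - 25.0 * x**3               # invented profile: 30 C at A, 5 C at B

print((T[0] + T[-1]) / 2)            # endpoint midpoint: 17.5
print(round(T.mean(), 2))            # mean along the path: ~23.74
```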
Yours is the comment that sums up the argument for the Averagers. The average temp between leading and trailing edge of a weather front should be, on average, the same as the average surrounding temp.
Therefore, Bob, on average, how the heck do you know, using averages, that the average weather front even exists?
Someone here once compared it to adding up all the license plates in your town, to find the average license plate for said town. Possible, precise, accurate, useless.
Temperature between locations represents a gradient. There is no guarantee that the gradient is linear. I like using the comparison of hiking between two locations on the flatlands of Kansas versus hiking between two locations the same distance apart in the hills of southern Missouri. The energy spent in covering the distance is vastly different because of having to go up and down the terrain. Similarly, trying to say the average of the temperature between two locations is (Point 1 + Point 2) / 2 is just ludicrous. It’s non-physical.
Might be before your time, but during the International Geophysical Year (1957-58), there was an elevated presence of scientists in Antarctica. Supposedly, the so-called ozone hole was observed then, but I haven’t tried to run down the details on it.
What everyone keeps forgetting is that to mathematically justify increasing the precision of a series of measurements, the measurements must have the property of stationarity, meaning that the mean and variance don’t change with time, which is rarely the case with a time-series because the point of a time-series usually is to record how the dependent variable varies with time. Temperature measurements are one of the most common forms of time-series data. There are ways to transform the data to be stationary, but then what does it mean when de-trended? It simply means the average (mid-point value) of the entire set, which still might change if more readings are taken.
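A quick way to see what failing that test looks like, using synthetic numbers only (a linear trend plus noise whose scale grows with time): both the mean and the variance depend on which window you examine.

```python
import numpy as np

rng = np.random.default_rng(2)

t = np.arange(1000)
series = 0.01 * t + rng.normal(0.0, 1.0, t.size) * (1.0 + t / 1000.0)

first, second = series[:500], series[500:]
print(f"first half:  mean {first.mean():5.2f}, variance {first.var():5.2f}")
print(f"second half: mean {second.mean():5.2f}, variance {second.var():5.2f}")
```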
As you point out, even the same parcel of air, if it could be tracked without changing any properties, as with the conundrum of the Heisenberg Uncertainty Principle, is in a constant state of flux, changing pressure and humidity. Therefore, strictly speaking, even what might appear to be the same parcel of air is always changing. Thus, it does not meet the requirement of measuring the same thing multiple times. The requirement is present to insure that any changes in measurements are random, and not part of a trend.
One can ‘define’ a set of data as being weather temperature recordings, from a station or multiple stations, and taking an average makes sense in that context. However, that does not justify manipulating it further. What one has is the average of station measurements, at the level of precision originally measured, along with its inherent uncertainty, with a formal propagation of error for the set used to calculate the average.
It is so clear that when one calculates the average weight of a basket of apples and pineapples, they have the average weight of a basket of fruit, not the average weight of either the apples or pineapples. Yet, alarmists seem unwilling to accept the obvious. Using a larger basket may give a different result, but provides no more information about the average weight of the different kinds of fruit.
About 1860 is generally considered to be the end of the “Little Ice Age”.
The warming appears to have been beneficial for Swiss real estate.…Interesting article here:
https://journals.sagepub.com/doi/10.1177/09596836221088247
Actually the Greenland ice cores show that 1875 was the coldest year in the last 10,000 years. From Jørgen Peder Steffensen, of Denmark’s Niels Bohr Institute, one of the most experienced experts in ice core analysis, in both Greenland and Antarctica.
“NorthGRIP the Greenland ice core project is being reopened to drill the last few meters through the ice sheet to the rock beneath the research station. The ice core over three kilometers in length has been hauled up to the surface piece by piece, and contains important data on the history of the climate of the earth. It bears the fingerprints of climatic conditions over more than 120 thousand years.”
“Now as we go forward to approach our time, we can see that in the period after four thousand years ago and up to the two thousand years ago (which is actually the Roman Age) the temperatures have been decreasing in Greenland by two and a half degrees. Then temperatures increased gradually up to a maximum point around a thousand years ago, we call it the Medieval Warm Period. And then temperatures declined and go down to minimum around 1650 a.d., before coming back up a little in the 18th century.
And then around 1875 we have right here the lowest point in the last 10,000 years.
And that matches exactly the time when meteorological observations started.”
The problem is that we can all agree completely that we have had a global temperature increase in the 20th century. Yes, but an increase from what? It was probably an increase from the lowest point we’ve had for the last 10,000 years. And this means it will be very hard indeed to prove whether the increase of temperature in the 20th century was man-made or it’s a natural variation. That would be very hard because we made ourselves an extremely poor experiment when we started to observe meteorology at the coldest time in the last ten thousand years.
https://rclutz.com/2023/05/05/1875-was-coldest-in-10000-years-warming-a-good-thing/
– About 1860 is generally considered to be the end of the “Little Ice Age”.
– Actually the Greenland ice cores show that 1875 was the coldest year
Do the ice cores have <15 year resolution? I thought it was much coarser than that.
Tony, Steffensen said “around 1875”, so 1860 is close enough.
Steffensen said “around 1875”,
Apologies, I missed that in the quoted portion. Was just thinking that it seemed a bit tight resolution for what it was 🙂
Nevertheless, “Large blocks of ice…” went past New Orleans on February 17, 1899, reaching the Gulf two days later, where inch-thick ice formed at the mouth between the three passes. (Henry, A. J. 1899. The weather of the month. Monthly Weather Review 27(2):50-53.)
It’s simply not possible to derive measurement accuracy to more decimal places than the originally recorded data. What was it mostly in that early era? One decimal place at best? Whole degrees? Half degrees? Quarter degrees?
If these people used the same funky math to calculate how fast horses ran to five decimal places back when they measured time in 5ths of a second or how fast Edward McGivern could shoot a revolver to that many decimal places when he could fire one so fast that even equipment able to record in 100ths of a second couldn’t keep up – everyone would call it out as absurd.
Same for anything in the past measured with much less precision than is commonly used now.
But with past temperature these people throw out numbers like 1850 -0.42648312 that were absolutely impossible to measure that precisely in 1850, and a lot of people believe it.
In case others didn’t know: “Edward McGivern (October 20, 1874 – December 12, 1957) was a famous exhibition shooter, shooting instructor and author of the book Fast and Fancy Revolver Shooting.“
“simply not possible to derive measurement accuracy “
They are not quoting measurement accuracy. They are quoting the uncertainty of a calculated result, which is the average of a very large number of anomalies.
Yes. Just divide by the number of stations, accuracy in extremis
By the sqrt(N), roughly.
But N was about 10 in this case so your reduction in uncertainty was small at best. This is a clear case of expressing a result far beyond its significant figures. This used to get you a bad mark in university.
No, N is the total number of measurements – thousands.
In fact measurement uncertainty is a small part of the uncertainty of the average. The biggest component is location – how different the result would have been if you had measured in different places.
That is covered in the GUM JCGM 100:2008.
This belies your assertion that dividing by the √n provides a good indicator of uncertainty.
The SEM, which is what you are referring to, only provides an interval in which the average may be located. It has nothing to do with measurement uncertainty, which is based on the variance of multiple measurements, that is, the experimental standard deviation of the measurements.
The theory is that N is the number of measurements of the identical thing, which allows the assumption that averaging an increasing N gets closer to the “real” measure. But, “averaging” a bunch of different thermometer readings, from different places at different times has none of the qualities that are required to give N that significance.
N can’t be “thousands” of measurements from 1850. I’m almost certain there weren’t “thousands” of measurement stations available then.
Mr. A: As I recall, Mr. Stokes went down in flames trying to defend Mann’s hockey stick, when (after digging and digging) McIntyre/McKitrick showed Mann had reduced “conforming samples” to, like, one tree. Ever since, the goal for CliSci and Mr. Stokes is to claim “thousands” of samples support Mann’s work. The challenge is to find “conforming” proxies, and hide the rest, and that’s what they do.
One Tree to rule Yamal
One Tree to find them
One Tree to call Yamal
and in Climate SyFy bind them
Using different instruments for each measurement negates the ability to increase accuracy. If you want to do that you have to use the same instrument for every reading, an impossibility.
This is blindingly obvious to anyone with a background in Engineering or the Physical Sciences, but apparently not to Stokes or bdgwx.
The SD divided by the square root of N only quantifies the uncertainty of the average of the numbers averaged. It does not account for the uncertainty of those numbers. To obtain the correct estimate of the overall uncertainty at a specified level of confidence you must combine the uncertainty of the average with all other sources of uncertainty, which include the MU of the instruments, systematic error, and potentially many others. In my experience it is very rare to find a thorough measurement uncertainty budget in any research paper other than those produced by calibration agencies and national standards bodies for weights and measures such as NIST. Almost all academic researchers report an incredibly optimistic uncertainty or none at all.
No, N = 1, maybe 3. You don’t have thousands of measurements, you have thousands of different things you have measured. Each measurement isn’t a measurement of the existing population, it is a measurement of a new addition to the population.
It strikes me that it is not just sampling with replacement, it is sampling with replacement while adding more marbles to the bowl.
Only applies when working with the same instrument, measuring the same thing.
It does not apply to hundreds of instruments each measuring an independent piece of air.
Not to mention that a temperature change at the equator has a completely different meaning than the same change in the arctic. Which is why averaging temps is absurd and averaging anomalies is even more absurder.
Nonsense. It is just the result of combining random variation of any kind. It is the arithmetic of cancellation.
Which has nothing to do with sparse inconsistent surface temperature data.
You never did any engineering, did you Nick. !!
You really don’t understand “measurement” at all !!
That left a mark!
This is the logical error that pervades ALL of climate science.
It is a mark of the mathematical ignorance behind the whole façade.
You CANNOT use the rule of large samples for sparse, inconsistent temperature data.
Measurement uncertainty has nothing to do with how accurately a mean has been calculated. The mean is only useful in locating the center value of an interval that describes the possible values a measurement may have.
Once you have reached the number of significant digits in the actual measurements, the SEM has no further value.
You do not understand the rule. To use it on temperature data, the dataset would have to consist only of measurements taken at the same location simultaneously. There are no such meteorological data.
The usual climatology line: “all error is random, Gaussian, and cancels!“
I’ve also noticed that this assumption of theirs is never confirmed or validated. They just make the declaration and expect everyone to go along with it. Pretty typical of all so called climate science. They never make any effort to validate their assumptions.
All they do is ignore all the objections then circle back to the beginning and repeat the same old misused lines, asserting themselves as the authorities which cannot be questioned.
That is only true if you use a constant value for a baseline, such as 15°C. That would allow using the variance of the calculated values to describe the distribution of temperature change across the globe.
To say that the temperature at 60°N is warmer than one at 0° because of a larger anomaly is totally asinine. The growth in temperature may be larger in one place than another, but it is totally incorrect to call a larger anomaly “warmer”.
Jim, you raise a point that is rarely addressed. In a rigorous analysis, one has to be concerned about the propagation of error in calculating the historical baseline and the subsequent subtraction to derive the anomaly. On the other hand, if one picks some arbitrary temperature, such as an assumption of what the temperature was at a certain year in the past, then one can define the baseline as being a constant with infinite precision, such as an integer is assumed to have. Furthermore, one then has the flexibility to choose a baseline that won’t result in a loss of digits resulting from the subtraction(s).
How about zero as the baseline? 🙂
This is nothing more than the idiotic meme in climate science that all measurement uncertainty is random, Gaussian, and cancels.
Temperatures do not have Gaussian characteristics. They are sinusoidal during daylight and show exponential decay at night. Measurements take on the same characteristics even when taken at “random” times. Cancellation simply cannot be assumed.
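A toy demonstration of that shape argument, using a pure sinusoid as a crude stand-in for the diurnal cycle (all values invented): values sampled at random times pile up near the daily maximum and minimum, nothing like a Gaussian centred on the mean.

```python
import numpy as np

rng = np.random.default_rng(3)

t = rng.uniform(0.0, 24.0, 100_000)                    # random sampling times (hours)
temp = 15.0 + 10.0 * np.sin(2.0 * np.pi * t / 24.0)    # cycle between 5 C and 25 C

counts, edges = np.histogram(temp, bins=10)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"{lo:5.1f} to {hi:5.1f} C: {'#' * int(c // 500)}")
```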
All operations have rules for when they work and when they can’t.
Your insistence on ignoring these rules is just more evidence of how ignorant all climate alarmists are of even basic statistics.
Numbers is just numbers, right?
And since this is a dynamic phenomenon that varies with time the measurements need to be simultaneous as well. So multiple samples at the same location and the same time. No such thing in meteorology.
A more applicable metric would be if every measurement station made two daily measurements, one at 0000 GMT and one at 1200 GMT. No need for a Tmax or Tmin, just temperature measured simultaneously everywhere.
WRONG… This is a massive mathematical error made by climate NON-mathematicians.
You cannot use the “rule of large samples” with sparse, erratic, always changing surface measurements.
https://wattsupwiththat.com/2025/11/10/the-curious-case-of-the-missing-data/#comment-4129873
Most of those anomalies are also FAKE, and/or have been “adjusted”.
South Africa shows the MEASURED warm period around 1930/40, so does some data from South America, Chile etc.
There is basically no unusable data for the Southern oceans, even CRU says it was “mostly made up”
Darn, last line should read…
There is basically no usable data for the Southern oceans; even CRU says it was “mostly made up”.
“almost nothing from the oceans”
That’s what this article says, too.
Ocean temperature data for this time period is worthless and should not even be considered.
And since there is no data from most of the oceans, and only sparse, clumped data for surface data…
There is absolutely NO WAY you can even pretend to construct a meaningless “global average temperature”.
The whole “climate” scam is built on empty air. !!
“unfit for intended use”
Nick,
How is it possible to have any kind of global average temperature when vast swaths of the southern hemisphere had no accurate temperature measurements at all before the 1920s?
Simple – fantasize.
Stokes’ posterior will supply the missing data.
Kinda hard to pull the data out of the sphincter with the head so firmly entrenched.
Not to mention the oceans that cover – what, 75% of the planet?
70% oceans, 1% lakes and rivers
This is why you very rarely see this cohort posting outside of this forum. It’s the rest of the world that’s getting it wrong, not these few dozen WUWT mutual support clubbers.
Bob,
How is it possible to have any kind of global average temperature when vast swaths of the southern hemisphere had no accurate temperature measurements at all before the 1920s?
Most of us have tried. The rest of the world cancels the accounts of anyone who challenges the basic assumptions of the climate religion.
A little Dan Kahan System 2 Motivated Reasoning confused. They draw the line at Engineering Statistics 101 denial.
As it were, we finished Death by Lightning last night. I’d read the related book, and we were both impressed by Matthew Macfayden’s Charles Guiteau. He would have been right at home here…
Well that was a load of gibberish !!
As we have come to expect from you.
Word salad Bob. He gives Kamala a run for her money.
Perhaps a psychiatrist might help.
Mr. bob: Did you intend this post for some furry tiktok group?
Mr. bob: Thank you so much for stopping by to bust our bubble! Your comment certainly demonstrates how wrong this cohort gets measurement uncertainty, but your math is too highly developed for us clubbers, can you please dumb it down to Mr. Stokes’ level?
The usual waste of space, bob.
Because, on the rare occasion you manage to trick us into opening your drivel, your type erase anything we type.
Does an anomaly carry a label of degrees F, C, or K? If so, then it is depicting a measurement of a physical quantity and should follow scientific protocol in describing the value and uncertainty.
Fahrenheit
Universal
Centigrade
Kelvin
we need a Universal temperature measurement system
in other words, bull s**t all the way down.
The SEM is useless for determining accuracy. There is no actual physical-science use for attaching it to a quoted measurement. The SEM is a metric for SAMPLING uncertainty and, at best, should be given as an additional uncertainty on top of the measurement uncertainty. Measurements should be given as “estimated value +/- MEASUREMENT uncertainty”, not as “estimated value +/- SAMPLING error”.
To derive an ‘anomaly,’ it is first necessary to calculate the mean and SD to estimate the precision of the baseline; historical data typically have less accuracy and precision than modern data. One then has to similarly calculate the mean and SD for the mid-range temperature (diurnal high and low). Then the corresponding baseline number has to be subtracted from the modern number of interest.
The older data very reasonably may have a precision that is one or two orders of magnitude less than the modern data. Therefore, the modern data have to be truncated to the same number of significant figures as the historical data before subtraction. Regardless, in the subtraction procedure, one or more significant figures will generally be lost; the smaller the difference between the historical data and modern data, the more digits that will be lost.
Thus, the process of deriving an ‘anomaly’ tends to reduce the precision rather than increase it.
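A worked toy example of the digit loss in that subtraction, with invented baseline and modern values and an assumed +/- 0.1 on each:

```python
import math

baseline = 14.3   # hypothetical baseline mean, taken as +/- 0.1
modern = 14.8     # hypothetical modern mean, taken as +/- 0.1

anomaly = modern - baseline
u_anomaly = math.sqrt(0.1**2 + 0.1**2)   # uncertainties combined in quadrature

print(f"anomaly = {anomaly:.1f} +/- {u_anomaly:.2f}")
# ~0.7% relative uncertainty on each input becomes ~28% on the 0.5 difference.
```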
Folks who have no experience in measurement just don’t understand the difference between an analog meter that has a 2% full scale uncertainty and moving to a digital meter with four decimal digits. You just can’t add zeroes to the old analog data to make it match up to the resolution of the new meter! That is unethical.
I never had a single class in statistics, but even I know you can’t average the temperature of horse poop and asphalt six hours after dark. So what was the average temperature of the average square mile of asphalt and concrete skyscraper in Chicago 1857?
Maybe we can plant some weather stations on the piles of bovine excrement emanating from Stokes and his messiah Mann hanging on his hokey schtick. You know, to simulate 1800’s downtown…
Actually, it is possible to derive measurement accuracy to more decimal places than the originally recorded data under some circumstances: if you have lots of measurements, and if what you’re interested in is averages, and if you have good reason to believe that the measurement inaccuracies are random and uncorrelated rather than systematic. It’s called the “standard error of the mean” (SEM).
In general, if you average N independent measurements with random (non-systematic) errors, the standard deviation of the mean is reduced by a factor of (1 / √N) = (N⁻⁰·⁵).
So if you have a measurement X with a standard deviation σ, by taking 100 measurements you can reduce the standard deviation by a factor of ten, if the errors of the individual measurements are known to be random and uncorrelated.
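A Monte Carlo sketch of both halves of that statement, with invented numbers (the sigma, bias and trial counts are arbitrary): random, uncorrelated errors shrink the scatter of the mean roughly as 1/√N, while a shared systematic bias is untouched by averaging.

```python
import numpy as np

rng = np.random.default_rng(42)

true_value = 15.0
sigma = 0.5       # per-reading random error (standard deviation)
bias = 0.3        # systematic offset shared by every reading
trials = 2000

for n in (1, 100, 2500):
    readings = true_value + bias + rng.normal(0.0, sigma, size=(trials, n))
    means = readings.mean(axis=1)
    print(f"N={n:>5}: scatter of mean = {means.std():.4f} "
          f"(sigma/sqrt(N) = {sigma / np.sqrt(n):.4f}), "
          f"offset from truth = {means.mean() - true_value:+.3f}")
```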
It is also perfectly acceptable to give average figures to many more decimal places of precision than is known, as long as you also provide realistic confidence intervals or uncertainties. Specifying a calculated value (such as an average) to a limited number of “significant digits” is just a shorthand notation for specifying rough precision, when not giving the precision explicitly. If the confidence intervals or uncertainties are given explicitly then there’s no need to truncate or round the calculated value to a limited number of significant digits, and, indeed, that is a bad practice, because doing so slightly worsens the accuracy of the data.
Of course, if anyone reading all that thinks it means we could know global average temperatures in the late 1800s to anything like the precision which Hadley Centre and the CRU pretend, then perhaps I can interest you in a very nice used bridge.

and completely pointless.
A wonderful concept, but cannot be relied upon in any practical sense. Sorry about that, but measurements are real – if the measurements don’t fit your pre-ordained thoughts of what they “should be”, then a re-think may be in order.
As Feynman said –
Averaging temperatures is just stupid – creating a “temperature” which doesn’t exist. As Zero Six Bravo in Iraq proved, the belief that “average temperatures” provided by the Met Office were useful turned out to be fatal.
Yes, ignorance and gullibility can kill.
To justify dividing by the square root of the number of samples, to improve the precision, a data set must have the property of “stationarity”.
Surface and ocean temperature readings DO NOT have this property.
Consequently, one has to propagate the uncertainty in quadrature — It is a complicated formula that requires the calculation of the mean first, but a reasonable approximation can be obtained as “average the squares of all the uncertainties and take the square root”.
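For concreteness, a sketch with some invented per-reading uncertainties. The first number is the approximation described above (average the squares, take the square root); the second is the textbook root-sum-square propagation for a simple mean of independent readings. Whether the independence assumption holds for temperature data is, of course, the argument running through this thread.

```python
import math

u = [0.5, 0.5, 0.7, 0.4, 0.6, 0.5, 0.8, 0.5]   # hypothetical per-reading uncertainties

rms_u = math.sqrt(sum(ui**2 for ui in u) / len(u))        # the approximation above
u_of_mean = math.sqrt(sum(ui**2 for ui in u)) / len(u)    # quadrature propagation for a mean

print(f"RMS of the uncertainties:           {rms_u:.3f}")
print(f"propagated uncertainty of the mean: {u_of_mean:.3f}")
```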
I hear it also comes with a toll booth to provide a steady income
There was a guy who actually sold the Eiffel Tower – twice.
Therefore I’ll take that bridge.
You can’t average intensive properties – meaning you can’t average temperature, even using measurements from only one thermometer. Each reading is an approximation, plus or minus one last digit, because it can’t be known if it is about to click up or down one last digit.
Actually, you can. If a high-quality instrument is used, one will get the exact same reading every time on the same measurand. When averaged, one will get the same number as each of the samples. However, if the thermometer is fluctuating a little, one will get slightly different temperatures each time. The average will be close to (but probably not exactly) the same. However, there will be a standard deviation that will allow one to estimate how frequently certain temperatures will be observed.
You have missed some requirements in order to use the SEM.
Please note: the SEM is only useful in describing one item such as a single bolt, a single unit of mass, or the wavelength of a constant frequency.
What are the requirements?
The SEM becomes the standard deviation of the distribution created by the means of the samples.
If the sample means distribution has a non-zero value for the standard deviation, the means of each sample are not the same. This indicates sampling error.
Sampling error also means that using the value “σ/√n” to obtain the SEM has uncertainty itself.
If all the requirements are met, one can use the SEM as the measurement uncertainty FOR THAT SINGLE ITEM. One can not use it as the measurement uncertainty for the next item.
Not worth going round in circles again. But just about everything Jim says here is wrong. He simply doesn’t understand the subject but thinks he’s right about everything. A dangerous combination.
Specifically.
“Measuring the exact same thing.”
Wrong. Not only can you have the mean of different things, it’s the main use case of the SEM.
Alternatively, you could say that what is being sampled is the population mean and each value is a “measurement” of that one thing. But Jim keeps confusing these different ways of looking at it, and thinks the SEM is only about measuring a single object.
“Measuring with the same device and procedure.
Measuring under similar environmental conditions.”
Same point. It might be best if you try to make your measurements as similar as possible, but there is no requirement, and in many cases that’s impossible.
“Multiple measurements of that same thing grouped into samples of size “n””
And this is where Jim keeps going completely off the rails. He thinks that because you can think of a sampling distribution in terms of what you get if you take multiple samples, that means the only way to calculate the SEM is to take multiple samples. I’ve tried to explain, every way I can, that this isn’t something you need to do, and that it would usually be a pointless thing to do, but Jim will never accept it.
“Each sample should have the same mean and variance, i.e., IID.”
And this weird confusion came up the last time we discussed it. He just doesn’t understand what iid means. He thinks it means that two samples have identical distributions, and will never accept that it refers to the probability distribution from which each value in the sample came.
“The SEM becomes the standard deviation of the distribution created by the means of the samples.”
The SEM is the standard deviation of the sampling distribution. The SD of a large number of samples will tend to that value.
“Sampling error also means that using the value “σ/√n” to obtain the SEM has uncertainty itself.”
Which is why you want to use a Student’s t distribution, or more advanced methods. But for reasonably large sample sizes the estimate given by the sample SD divided by root N is usually good enough.
‘The SEM is the standard deviation of the sampling distribution. The SD of a large number of samples will tend to that value.’
First, the standard deviation (SD) of a sample is an estimate of the variability of the elements within that sample. The standard error of the mean (SEM) is an estimate of how close the calculated mean of that sample is likely to be to the population mean. Since SD and SEM measure different concepts, your statement that SD will ‘tend’ to SEM is nonsense.
Second, sample temperatures at, say, Kalamazoo and Timbuktu are obviously drawn from different populations, hence believing that their combined ‘anomalies’ have any meaning is idiotic.
“First, the standard deviation (SD) of a sample is an estimate of the variability of the elements within that sample.”
You are mixing two different things. The sampling distribution is made up of the means from multiple samples. It is not the mean of a single sample.
The sampling distribution will, per the Central Limit Theorem and the Law of Large Numbers, tend to a Gaussian distribution whose mean approaches the mean of the population. How closely it approaches the population mean depends on the size of the samples making up the sampling distribution.
Bellman is still arguing from the meme of statisticians that “numbers is just numbers” and don’t have to be analyzed under real world conditions. E.g. one sample is all you need to accurately estimate the mean of a population. This means that you can *always* just assume the SD of the sample is equal to the SD of the population. Then the SEM becomes the SD_sample/sqrt[n]. It simply doesn’t matter to the statistician that in the real world the population SD and the sample SD may not be the same and this INCREASES the SEM in the real world.
Not wrong! Why do you never show any resources to bolster your assertions? Do you find Metrology too mundane for a statistician?
As before, let’s go through the proof from Dr. Taylor, who teaches uncertainty, which you do not.
Let’s examine this point by point.
5.7 Standard Deviation of the Mean
So we have a normal distribution of random measurements around a true value. The basic assumption in the GUM is that we do not know the true value, we only know an estimated value.
And we end up with a number of experiments, each with N measurements, and each with an average and σ.
Tell us your definition of what “measurements of the same quantity” actually means. I am very interested in how you manage to wrangle a meaning into something different.
σₓ₁ = … = σₓₙ = σₓ.
Thus the SEM becomes the σₓ/√n.
You can pretend that “similar” things can be substituted for the requirement of the same quantity. However, that by itself introduces the need for performing additional experiments on the similar things and proving that their true values and distribution width σₓ are exactly the same.
Then you need to tell us how this is accomplished when measuring temperatures at different times and different places with different true values.
This derivation is found elsewhere on the internet. I’ll leave it to you to spend the research time that I have spent: find them and study why they all use the same requirements and specify the same quantity, i.e., multiple measurements of the same thing under repeatable conditions.
When you can find a metrology reference that says something different, you can post it here. I will be happy to see it.
“Why do you never show any resources to bolster your assertions?”
Because I’d have hoped you would have learnt something by now. OK, some references –
You claim you can only use the SEM when measuring the same thing.
https://www.statology.org/concise-guide-standard-error/
Different customer scores, not the same customer satisfaction score.
You say you need “Multiple measurements of that same thing grouped into samples of size “n””
…
https://www.greenbook.org/insights/research-methodologies/how-to-interpret-standard-deviation-and-standard-error-in-survey-research
“As before, let’s go through the proof from Dr. Taylor, who teaches uncertainty, which you do not.”
You keep pointing to an example from metrology. This is one application of the SEM – repeatedly measuring the same thing to get a more precise measurement of that thing. Your problem is you can’t accept that this is not the only use, that it applies just as well to sampling different things from a population.
“The basic assumption in the GUM is that we do not know the true value”
Of course you don’t know the true value – there would be no point in sampling if you knew it.
“we naturally imagine repeating our N measurements many times”
Operative word being “imagine”.
“And we end up with a number of experiments, each with N measurements, and each with an average and σ.”
You do this in your head in order to reach the proof.
“Tell us your definition of what “measurements of the same quantity” actually means.”
I’m not the one claiming they all have to be the “same thing”. As I said, you can imagine the mean of the population is the one thing you are measuring and each value in your sample is a measurement of that one thing, but that’s obviously not what you mean by the same thing.
The point about sampling is you want all the values to be iid, independent and identically distributed. If you measure the same thing with the same instrument and with the same uncertainty, then all your measurements are random values from identical distributions. In the same way, when you take a random sample from a population, each value is a random value from an identical distribution – namely, the distribution of the population.
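A minimal numerical sketch of that point, with an invented population and an arbitrary sample size (numpy assumed): the SEM estimated from a single sample of different things agrees with the spread of many sample means.

import numpy as np

rng = np.random.default_rng(42)
# An assumed population of a million different "things" (values are arbitrary)
population = rng.normal(15.0, 8.0, 1_000_000)

n = 100
# SEM estimated from a single sample of n different items
one_sample = rng.choice(population, n)
sem_estimate = one_sample.std(ddof=1) / np.sqrt(n)

# SD of the means of many independent samples (the sampling distribution)
many_means = rng.choice(population, size=(5000, n)).mean(axis=1)
print(f"SEM from one sample:      {sem_estimate:.3f}")
print(f"SD of 5000 sample means:  {many_means.std(ddof=1):.3f}")  # the two agree closely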
“Because x1, …, xN are all measurements of the same quantity x, their widths are all the same and are all equal to σₓ”
In other words, identically distributed.
“Thus the SEM becomes the σₓ/√n.”
Well done. Now explain your proof that shows this would not be the case if you had iid values randomly selected from a population.
“You can pretend that “similar” things can be substituted for the requirement of the same quantity.”
Why pretend? That’s what the maths says. Probability theory works in the same way.
“However, that by itself introduces the need for performing additional experiments on the similar things and proving that their true values and distribution width σₓ are exactly the same.”
You keep failing to understand what iid means. I keep trying to tell you, it relates to the probability distribution of each value. When you take a random sample, each value is a random value from the distribution defined by the population. It does not mean that each value has to have the same value, just that it comes from an identical distribution.
“Then you need to tell us how this is accomplished when measuring temperatures at different times and different places with different true values. ”
I keep telling you, but you don’t hear.
1) Each measurement is a value from the population. What population depends on what mean you are talking about. It could be the temperature at a single location for each day of the month, or it could be the temperature over the entire globe for a specific day, or any other region or time scale you are interested in.
2) As I also keep having to explain, any actual temperature data set is not a random sample. You do not calculate the mean just by averaging all the values, and you do not calculate the uncertainty by dividing the standard deviation by root N.
Great strawman you’ve created.
Now tell us how this relates to evaluating an input quantity as defined in the GUM.
“Great strawman you’ve created.”
In what way is it a strawman? You said
I gave you a simple example where it was useful when not describing one item.
If you think I’m misunderstanding what your claim is, maybe you need to be clearer about what you are saying.
“Now tell us how this relates to evaluating an input quantity as defined in the GUM.”
What has the GUM got to do with this? We were talking about the SEM. The GUM doesn’t even think that’s a correct phrase. The GUM is about measurement uncertainty not about sampling.
“B2.1 measurable quantity”
We’ve been over this before. Either you think an average, such as the global mean anomaly, is an attribute of the phenomenon of the global temperature, or you think it’s a statistic.
Personally I think the definition given could mean you can treat the mean as a measurable quantity and you could then describe the uncertainty of the mean as a measurement uncertainty. But you keep disagreeing, yet still insist we use the GUM.
If you are interested in the individual measurements, then I’m not sure what you want me to discuss about your two definitions. The measurable quantity is the temperature at that specific time, and the value is the magnitude of that temperature expressed in Celsius or Kelvin.
Why do you keep using examples of sampling things other than physical measurements?
If you want to discuss survey sampling, you need to go to another forum.
Is that an admission you can use SEM for samples that are not all of the same thing? Maybe you should have been clearer that when you said
“If you want to discuss survey sampling, you need to go to another forum.”
This whole article was about how accurate you could be in determining the global average anomaly. That requires understanding sampling.
No. This thread that you are engaging in is about whether 8 significant figures are justified and meaningful.
Using infinitely precise integers to rank subjective customer satisfaction is a poor analogy for measuring irrational numbers to assess the magnitude of temperatures.
That’s not what I was doing. I was using it to show that Jim was wrong to claim you could only use the SEM when measuring the same thing. I specifically said just before the example
If one is going to claim that multiple measurements can be used to improve the precision of the probable value of a measurand, then the same thing must be measured every time, not some proxy substituting for it, not something that is highly, but imperfectly, auto-correlated, as most air parcels are.
He isn’t the only one who thinks he is right. I’m not going to submit my CV, but I have good reasons to believe that I have a pretty good grasp of the topic, and as a result of these interchanges, I have done far more refresher reading than I thought I would ever need to do. I’m of the opinion that it is you who doesn’t understand the subject and has closed your mind to the rather explicit requirements for using the various methods of assessing uncertainty. Your arguments have not persuaded me that you understand it better than Jim, Tim, and me.
“If you can’t explain it to a six-year-old, you don’t understand it yourself” ― Richard Feynman
Great comment.
With the information available on the Internet about metrology, there is no excuse for not posting resources covering measurement uncertainty.
I am not interested in discussing customer satisfaction polls or political polls. Those are not physical measurements of measurable quantities.
Clyde, I think you just summarized my own background and thoughts perfectly.
I’d like to remind you that this thread was triggered by the author pointing out examples of the Met Office showing anomalies to 8-significant digits to the right of the decimal point. The essence of the disagreement is whether one is justified in displaying a temperature mean or anomaly to more significant figures than the original measurement. Bringing up other examples of the use of the SEM is actually off topic and only serves as a distracting Red Herring.
My position is that it is not justified because a temperature time-series has several trends, depending on the start and stop time of the series. That is, temperature time-series fail the stationarity requirement that the mean, standard deviation, and correlation structure remain constant. However, single objects don’t fail the test. Yet, you have not even acknowledged that I provided a long comment on that 4-days ago at:
https://wattsupwiththat.com/2025/11/10/the-curious-case-of-the-missing-data/#comment-4130182
Incidentally, my citation elsewhere, [Smirnoff, Michael V., 1961, Measurements for engineering and other surveys, Prentice Hall, p. 29], has an extended discussion on “probable error” and how it varies in proportion to the square root of the number of measurements. For an object, multiple measurements do have the property of stationarity.
“I’d like to remind you that this thread was triggered by the author pointing out examples of the Met Office showing anomalies to 8-significant digits to the right of the decimal point.”
And I responded to that point in my first comment. I was told that wasn’t the point, and as usually happens here the comment section descends into the usual arguments about uncertainty, the SEM, and now the meaning of expected value.
“The essence of the disagreement is whether one is justified in displaying a temperature mean or anomaly to more significant figures than the original measurement.”
And my view is that you definitely can be.
“Bringing up other examples of the use of the SEM is actually off topic and only serves as a distracting Red Herring.”
I’m not sure what “other” uses you are talking about. This article is about the global annual temperature. The uncertainty is going to depend on the SEM.
“That is, temperature time-series fail the stationarity requirement that the mean, standard deviation, and correlation structure remain constant.”
We are talking about individual years. I’m not sure how much individual trends matter.
“Yet, you have not even acknowledged that I provided a long comment on that 4-days ago at”
Do I have to read, let alone comment on every post here? It’s hard enough keeping up with all the comments and insults directed at me.
I was responding to the following statement by you:
“You keep pointing to an example from metrology. This is one application of the SEM – repeatedly measuring the same thing to get a more precise measurement of that thing. Your problem is you can’t accept that this is not the only use, that it applies just as well to sampling different things from a population.”
Again, we are not interested in those “other” uses unless they can be used to explain why they are valid when the first application requires the property of stationarity. The atmospheric temperature is always changing, while a steel ball bearing in a controlled environment never changes.
No whine before its time. [Paul Masson winery advertisement] How convenient that you have time to respond to most of the replies to your comments, even other comments I have made, but suddenly you plead a lack of time to excuse not responding to something that can destroy your position.
You still have not responded to my criticism that a data set consisting of a time-series does not inherently have the property of stationarity that would justify improving the precision when calculating the decadal baseline. For that matter, even a daily or annual average has several trends. Without having a baseline of the same precision as the annual temperatures used to calculate the annual anomaly, the rules of computation require that the annual averages be rounded off to the same precision as the baseline before subtraction. This is all about legitimate, acceptable manipulation of raw data to calculate averages and the derivative anomalies, to the precision commonly quoted.
bellman thinks *all* measurements are made of the same thing using the same instrument under the exact same conditions. Thus all measurements are random, Gaussian, and cancel.
He continues to state he doesn’t use that meme all the time but it is obvious that he does.
He simply can’t understand that the SEM is a metric for SAMPLING error and not for measurement uncertainty. It is an ADDITIVE factor to the measurement uncertainty. But by assuming that all measurement uncertainty is random, Gaussian, and cancels he can then use the SEM as the measurement uncertainty.
The SEM does *NOT* allow more precise measurement. It only allows more precisely locating the average of the population. But that only helps if all of the measurement uncertainties are random, Gaussian, and cancel! Once again, it’s the ‘random, Gaussian, and cancels’ meme that he denies colors every assertion he makes! Otherwise, lowering the SEM does *NOT* increase the accuracy or the precision of the measurement data.
“bellman thinks *all* measurements are made of the same thing using the same instrument under the exact same conditions.”
Your obsession with me is not remotely healthy. Obviously I don’t believe anything of the sort.
“He continues to state he doesn’t use that meme all the time but it is obvious that he does.”
If it’s so obvious why do you have to jump into every conversation in order to point it out?
“He simply can’t understand that the SEM is a metric for SAMPLING error and not for measurement uncertainty.”
The SEM is a metric for sampling error. That’s what I keep trying to tell you. It describes how uncertain a sample mean is. You can also use it to determine the uncertainty of a measurement, as in measuring the same thing and taking the mean of your measurements. You can call it the SEM or the SDOM or whatever you want, it’s all the same maths.
I’m really not sure what point you think you are making. This thread started with Jim insisting you could only use the SEM when measuring the same thing.
“It is an ADDITIVE factor to the measurement uncertainty.”
I keep trying to explain to you that, in general, the uncertainty of a sample mean caused by measurement uncertainty is in most cases going to be very small compared with the uncertainty from the random sampling. And also that you should not be adding any random measurement uncertainty to the SEM, as it’s already been accounted for in the standard deviation of your sample.
Now if there is a systematic measuring error, then that does have to be accounted for, along with any other possible biases in the sampling.
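A rough sketch of that double-counting point, with invented numbers (a population spread of 5, a random reading error of 0.5; numpy assumed):

import numpy as np

rng = np.random.default_rng(1)
true_values = rng.normal(20.0, 5.0, 10_000)               # spread of the different things sampled
readings = true_values + rng.normal(0.0, 0.5, 10_000)     # each thing read once, with random error

n = 200
sample = rng.choice(readings, n)
print(f"sample SD = {sample.std(ddof=1):.2f}")             # ~sqrt(5.0**2 + 0.5**2), about 5.02
print(f"SEM       = {sample.std(ddof=1) / np.sqrt(n):.3f}")
# The random reading error is already inside the sample SD, so adding it to the SEM
# again would count it twice. A systematic bias, by contrast, would not show up here at all.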
“But by assuming that all measurement uncertainty is random, Gaussian, and cancels he can then use the SEM as the measurement uncertainty.”
You just keep making stuff up.
“The SEM does *NOT* allow more precise measurement.”
The SEM of what? You keep confusing the SEM of the sample and the SEM of individual measurements. If you measure the same thing under the same conditions, you can get a more precise measurement of that thing. That’s the whole point of taking the mean of multiple measurements. If you are taking a sample of different things and only measuring each value once, you cannot make any individual measurement more precise. But you do know that if the uncertainties are random the overall measurement uncertainty will decrease in exactly the same way.
We’ve been over this so many times, e.g. using both specific cases and the general propagation rules. But you never seem to be able to put these things in order, and keep claiming I’m saying something different.
“But that only helps if all of the measurement uncertainties are random, Gaussian, and cancel! Once again, it’s the ‘random, Gaussian, and cancels’ meme that he denies colors every assertion he makes!”
I really worry about you. It can not be healthy to have to keep repeating the exact same phrase over and over. You’ve used it four times in this one short comment.
The GUM NEVER mentions “random uncertainty”. It only mentions errors that may be random.
The implication is that errors are known and calculated by “true value ± error”. Since the GUM tells you that the true value is never known, the old paradigm of error analysis had to be updated.
As I have already explained, uncertainties are not errors, they are variances/standard deviations and therefore don’t cancel. The values of the probability distribution are included in the variances as positive numbers (due to squaring) and added, never subtracted.
Tim has pointed this out to you multiple times and you fail to understand because of your statistical training; numbers are just numbers. Measurements are different. If your observations are all in the units place, then the mean and variance should also be in the units place. Any samples you take will use numbers with units place and therefore the mean and variance of each sample should also be in the units place. Similarly, the sample means distribution will be made up of numbers which have only a units place. The mean and variance of the sample means distribution will also be calculated from numbers in the units place.
In other words, at each and every point in the calculations, the data will remain with the original resolution that the measurement was made with. Consequently you can not increase the value of what was measured.
You continually fail to address the ramifications of this assertion. Tell us why every physical constant is not known to the 10⁻¹⁰⁰ or 10⁻¹⁰⁰⁰ decimal place? With today’s computer-controlled measurement systems, doing a million or more measurements is child’s play. If I could just keep dividing by √(10¹⁰⁰⁰), there would be no reason not to quote the constants to that many places.
“The GUM NEVER mentions “random uncertainty”.”
True, but they use terms like random variable and independent which have the same meaning. It’s the independence that makes an uncertainty random.
“Since the GUM tells you that the true value is never known, the old paradigm of error analysis had to be updated.”
As they note, that update makes no difference to the equation. The general equation for propagating uncertainties is the same as the general equation for propagating errors.
“As I have already explained, uncertainties are not errors, they are variances/standard deviations and therefore don’t cancel.”
Correct, apart from the fact that they cancel in exactly the same way as those based on error propagation.
“The values of the probability distribution are included in the variances as positive numbers (due to squaring)”
Uncertainties are always positive, squaring has nothing to do with it. It’s the partial derivatives that may be negative until squared.
“and added, never subtracted.”
For independent inputs, that’s correct. But when there is correlation you need to use equation 13. The component for the covariance is not squared. If the correlation is negative, or if the function involves subtraction, then that value will be subtracted from the uncertainty.
Then see Note 1. If the correlation is +1, the equation reduces to a sum of unsquared values, each multiplied by the partial derivative of the function. If you are adding the variables this would give you the sum of all the uncertainties. But if you are subtracting a value, its partial derivative is -1. The result is that its uncertainty is subtracted. x1 – x2 would have uncertainty u(x1) – u(x2).
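A numeric sketch of that special case, using invented uncertainties u(x1) = 0.7 and u(x2) = 0.3 and writing the covariance as r·u(x1)·u(x2):

# f = x1 - x2, so the sensitivity coefficients are +1 and -1
u1, u2 = 0.7, 0.3      # assumed standard uncertainties
c1, c2 = 1.0, -1.0
r = 1.0                # fully positively correlated inputs

u_c = ((c1 * u1)**2 + (c2 * u2)**2 + 2 * c1 * c2 * r * u1 * u2) ** 0.5
print(u_c)             # ~0.4, i.e. |u(x1) - u(x2)|, not u(x1) + u(x2)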
“Tell us why every physical constant is not known to the 10⁻¹⁰⁰ or 10⁻¹⁰⁰⁰ decimal place? With today’s computer-controlled measurement systems, doing a million or more measurements is child’s play.”
For exactly the reason Bevington spells out. You cannot make that number of measurements, and the more precise your answer gets, the more effect tiny systematic errors will have.
As so often you seem to have no concept of the value of numbers here. You ask about measuring anything to 1000 decimal places. First, that’s a meaningless value. Why would you want any physical constant known in that detail? But more importantly, a million or more measurements are just not going to help. The square root of a million is a thousand. Even in an ideal world making a million measurements would increase the precision of the result by 3 decimal places. If you were measuring in units, to get to 10⁻¹⁰⁰⁰ would require 10²⁰⁰⁰ measurements.
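The arithmetic behind that, assuming the idealised case where the uncertainty really does fall as 1/√n:

import math

print(math.isqrt(10**6))                  # 1000: a million readings shrink sigma/sqrt(n)
                                          # by a factor of 10**3, i.e. about 3 decimal places
print(math.isqrt(10**2000) == 10**1000)   # True: a 10**-1000 target needs 10**2000 readings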
He has of course seen this inconvenient truth illuminated many times in the past, yet to this day refuses to acknowledge the implications that uncertainty always increases.
The price of the truth is too high for him as it would force him to turn loose of his “averaging increases resolution” line.
“Again, we are not interested in those “other” uses”
I’m very interested in them, given the comment I was responding to was claiming that you could only use the SEM when measuring the same thing.
“unless they can be used to explain why they are valid when the first application requires the property of stationarity.”
What does stationarity have to do with the question of a sample mean? This feels like a red-herring. The main problem with non-stationarity in sampling would be if it introduces a bias in your sample. But that’s an issue of the nature of the sampling. As long as your sampling procedure is good that shouldn’t be an issue. In the case of global temperatures, this is why you use area weighting and anomalies, for instance.
“The atmospheric temperature is always changing, while a steel ball bearing in a controlled environment never changes.”
Which is why you want to monitor it. Any temperature record is over a specific time frame. Say a month, or a year. It doesn’t matter that the temperature changes over that period, because your average is the average of all temperature over that period. It’s no different to the concept that when you take a random sample of, say, peoples heights, all the heights will be different. There isn’t a difference between averaging many different sized things, averaging the same thing that is changing over time, or combining the two.
“No whine before its time“.
Does this patronizing insult ever get old? You were the one complaining that I hadn’t responded to one of your comments. A comment not even addressed to me.
“You still have not responded to my criticism that a data set consisting of a time-series does not inherently have the property of stationarity that would justify improving the precision when calculating the decadal baseline.”
Because I’ve no idea what you think that means.
As I keep saying, the precision of the base value is not really an issue, if all you want to know is whether one value is bigger than another in like-for-like comparisons. It’s just a constant and it makes no difference how accurate or precise it is. As to the non-stationarity over a 30 year period, I doubt it makes much difference. The only issue is that with a trend over that period the standard deviation will be spuriously increased. That could give you a spuriously larger uncertainty. But I doubt it would make much difference, given the size of the year to year variability compared with the trend. If it is a problem you can easily de-trend the series.
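A minimal sketch of that de-trending step, with an invented 30-value annual series (trend and noise levels are arbitrary; numpy assumed):

import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1961, 1991)
# Assumed series: a small trend plus much larger year-to-year variability
series = 0.02 * (years - years[0]) + rng.normal(0.0, 0.3, years.size)

raw_sd = series.std(ddof=1)

# Remove a linear trend before estimating the variability
slope, intercept = np.polyfit(years, series, 1)
residuals = series - (slope * years + intercept)

print(f"SD of raw series:    {raw_sd:.3f}")
print(f"SD after detrending: {residuals.std(ddof=1):.3f}")  # smaller when a trend is present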
“Without having a baseline of the same precision as the annual temperatures used to calculate the annual anomaly, the rules of computation require that the annual averages be rounded off to the same precision as the baseline before subtraction.”
If you mean the so-called rules for significant figures, as I’ve said before, I don’t think they make any sense when you are also using proper uncertainty analysis. Just work out the uncertainty of the anomaly and round accordingly. (or don’t round for all I care, if it’s just data).
A lot of ‘word salad’ to dance around actually critically responding to my claim that to justify increasing the precision (reducing the uncertainty) of the estimate of the value of a measurand (such as the average monthly temperature and its uncertainty), the true value cannot be changing.
Another way of putting it is that one cannot narrow the estimate of the uncertainty of the position of a moving target by following all of the positions. The longer the measuring process takes, the larger the variance will be. That is the essence of the stationarity requirement, and hence Jim’s list in this thread of the requirements that must be met to justify claiming precision can be improved by taking an increased number of measurements.
Don’t expect any further responses to this thread. I think I have said everything I can to convince you that your position, that the uncertainty of the mean value of some characteristic of a past, current and future population can be reduced by simply taking more measurements, is wrong. If there is no limit to the future size of a growing population, how can one possibly narrow the uncertainty as the mean increases?
“the true value cannot be changing.”
The “true” value isn’t changing. That value is the mean over the specified period.
“The longer the measuring process takes, the larger the variance will be.”
As I said, that’s true to a very small extent, but any increase is small compared with the point to point variance.
“That is the essence of the stationarity requirement,”
You keep claiming “requirements” without evidence. At best these are assumptions. If a time series has a trend, then the assumption of iid is wrong. But that doesn’t mean you cannot get a reasonable estimate by assuming there is no trend. In most cases it makes no significant difference. If you are worried about it, it’s easy enough to detrend the data in order to reduce the uncertainty.
“If there is no limit to the future size of a growing population, how can one possibly narrow the uncertainty as the mean increases?”
I suspect your problem is you are looking at the mean as an ongoing process. That’s when non-stationarity is a problem: when you just keep adding each new set of data into an ever-expanding time period. But that’s not what anybody is doing. You are generating averages for fixed periods, months, years or longer – each the same length.
You just can’t help yourself can you? You just keep equating error and uncertainty.
When are you going to establish that you know the difference between the two paradigms?
Why don’t you show us what references you are using to delineate the differences.
A true value is not a mean, mode, or median!
A true value is an a priori value that you know. It can be used to calculate error.
The GUM says;
Maybe you have a different definition of what a definition is.
So adding more monthly averages to a monthly series is not an ever-expanding time period? How about adding more yearly averages to a yearly series?
FYI, a time series has any number of equal time steps.
“You just can’t help yourself can you? You just keep equating error and uncertainty.”
I was replying to Clyde who used the term true value. I specifically put “true” in inverted commas. I could have just said value, but that would not have been clear.
Nothing in my comment said anything about error. It makes no difference which model of uncertainty you use. You are just looking for excuses to avoid the point.
If you think using the GUM model of uncertainty makes any difference to my argument, you need to explain what that is. Which will be fun, given we weren’t even talking about measurement uncertainty.
“Which will be fun, given we weren’t even talking about measurement uncertainty.”
Exactly what uncertainty were we talking about then? The SEM?
For the umpteenth time, the SEM is SAMPLING error. Are you claiming we were talking about *sampling* error instead of measurement uncertainty?
“Exactly what uncertainty were we talking about then? The SEM?”
See the original comment I was responding to
https://wattsupwiththat.com/2025/11/10/the-curious-case-of-the-missing-data/#comment-4130045
“But that doesn’t mean you cannot get a reasonable estimate by assuming there is no trend.”
The words of a fortune teller that sees all and knows all!
In Tim’s world view mathematics is no different than fortune telling.
The standard error of the mean only defines the value at the CENTER of an interval containing possible measured values. It does not define standard uncertainty, i.e., measurement uncertainty.
The SEM also has to be derived when meeting multiple requirements I have already posted. Measuring the SAME thing with the same device and procedure is a must.
“The standard error of the mean only defines the value at the CENTER of an interval containing possible measured values.”
It defines (or estimates) the standard deviation of the sampling distribution. If that distribution is symmetrical then the mean will be at the centre of a confidence interval defined by the SEM.
“It does not define standard uncertainty, i.e., measurement uncertainty.”
We want to know how much confidence we can have in our sample mean. The bigger the SEM the bigger the confidence interval. If you think of the sample as a measurement of the mean then you can call that confidence interval the measurement uncertainty, or you can just call it the confidence interval. Regardless, it’s the value you want if you want to know how much uncertainty there is in your estimate.
“sampling distribution. If that distribution is symmetrical then the mean will be at the centre of a confidence interval defined by the SEM.”
This is your statistician’s bias showing again! We’ve been over this many times. You simply CANNOT assume a symmetrical distribution in the real world!
In a real world distribution that is *not* symmetrical the mean of the samples can be modulated by the mode. That implies that the average is *not* the best estimate of the expected value (i.e. the best estimate for the measurement).
When are you going to abandon your statistician biases and join us in the real world?
“This is your statistician’s bias showing again!”
Yes, I’m biased towards getting things correct rather than spouting nonsense. I know that’s out of fashion at the moment.
“You simply CANNOT assume a symmetrical distribution in the real world!”
Which is why I said “if”. I was correcting the claim that the SEM was defining the centre of a confidence interval. That was Jim’s assumption of symmetry, not mine.
“In a real world distribution that is *not* symmetrical the mean of the samples can be modulated by the mode.”
You keep throwing out nonsense like this, without ever explaining what you mean. The mean is the mean. It isn’t “modulated” by anything other than the distribution. It can be different to the mode or the median if that’s what you mean, but it makes no sense to say they modulate it.
“That implies that the average is *not* the best estimate of the expected value”
That’s literally what “expected value” means. But as always you just guess what these words mean, and then complain when those who use the terms don’t agree with your fantasies.
“When are you going to abandon your statistician biases and join us in the real world?”
When are you going to understand that statistics describe the real world?
When are you going to get past your juvenile understanding of statistics, and realise that they are meaningless when the data is missing or totally corrupted!
You may have learnt to “do” basic statistics… but you have obviously totally failed to understand what you are actually doing.
“When are you going to understand that statistics describe the real world?”
No, physics, chemistry, biology, etc. ATTEMPT to describe the real world using the language of mathematics. Statistics tries to interpret/dumb down the complexity and variability of those mathematics to a point that the limited perceptions of humans can interpret them. Sometimes it works, frequently it doesn’t.
There are lies, damn lies, and statistics.
“Yes, I’m biased towards getting things correct rather than spouting nonsense. I know that’s out of fashion at the moment.”
getting things correct means those things adequately represent reality, REAL world reality, not reality in statistical world where assumptions like “all measurement uncertainty is random, Gaussian, and cancels” are made.
“Which is why I said “if”.”
You always want to force the discussion into very specific instances which have nothing to do with reality, they only exist in your statistical world. “if” is meaningless when applied to the measurement of journal sizes in an engine – “if the wear on each cylinder is the same” just doesn’t cut it in the real world!
“I was correcting the claim that the SEM was defining the centre of a confidence interval.”
You corrected nothing in reality. Even in an asymmetric distribution the SEM *still* defines the center of a confidence interval. The SEM is the standard deviation of the sample means. The mean of those sample means is the best estimate, and the distribution of the sample means is Gaussian if the sample sizes are large enough to invoke the Central Limit Theorem and the LLN.
You *can* get a Gaussian distribution from the means of multiple samples of the population even if the population is an asymmetric distribution. Whether the mean of those multiple sample means is accurate, and whether the SD of the sample means is accurate, is not guaranteed if generated from an asymmetric parent.
“You keep throwing out nonsense like this, without ever explaining what you mean.”
I’ve explained this to you THREE SEPARATE TIMES. As usual, it just never sinks in. In an asymmetric distribution the MODE is the value that occurs most often. That means in any sample it is probable that the MODE appears in the sample more often than the actual average of the population data. Thus the mode modulates the calculated mean of the sample means away from the actual average value and toward the mode value!
I know you are going to reply that it is an insignificant bias but that is just an unjustified assumption – especially when you are trying to find an average way out at the resolution limits. For temperatures, where you are trying to identify measurements out to the hundredths digit, any modulation of the average by the mode *will* show up. E.g. the nighttime exponential decay in temperature is an inherently asymmetric distribution, yet climate science (AND YOU) assumes that the daily temperature profile is perfectly symmetric so that the mid-point temperature *is* an accurate “average”.
“but it makes no sense to say they modulate it.”
You can’t even understand basic English apparently. from the freedictionary site: “1. To regulate or adjust to a certain degree”
The mode value *will* “regulate” or “adjust” the value of the population mean calculated from multiple samples because it is the value that appears the most often.
“That’s literally what “expected value” means. But as always you just guess what these words mean, and then complain when those who use the terms don’t agree with your fantasies.”
What a load of crap! In an asymmetric distribution value expected from a single draw from the parent distribution, i.e. the EXPECTED VALUE of the parent distribution, is the MODE, not the average. The mean of the asymmetric distribution is *NOT* the expected value, it is merely the “balance” point of the distribution. In an asymmetric distribution the mean of the sample means tends toward the “balance” point of the asymmetric distribution, *NOT* to the expected value of the asymmetric parent distribution.
“When are you going to understand that statistics describe the real world?”
The real world doesn’t consider the expected value of an asymmetric distribution to be the mean – but you and statistical world apparently do. The real world doesn’t believe all measurement uncertainty is random, Gaussian, and cancels, but you and statistical world apparently do. The real world doesn’t believe that “numbers is just numbers”, but you and statistical world apparently do. The real world doesn’t believe the average of a multimodal distribution provides any knowledge of the individual modes, but you and statistical world apparently do. The real world doesn’t believe the mid-point of an asymmetric distribution is the average value, but apparently you and statistical world do.
Statistical descriptors describe the data – but that description *has* to be understood in real world terms, not in “blackboard” terms.
“getting things correct means those things adequately represent reality”
Well, yes. If statistics don’t adequately represent reality, then they are bad statistics. That’s why I said it’s important to understand the statistics and not engage in wish fulfillment.
“not reality in statistical world where assumptions like “all measurement uncertainty is random, Gaussian, and cancels” are made.”
Thanks for demonstrating there’s no point taking anything else you say in this lengthy post seriously. If you really think that statisticians think that all distributions are Gaussian etc, then you are just demonstrating your ignorance of the subject.
It’s so ironic that these personal attacks are in response to me pointing out that you couldn’t assume a distribution was symmetric.
This bit looks a bit more interesting, once you ignore all the personal abuse.
“In an asymmetric distribution the MODE is the value that occurs most often.”
Correct. It’s also correct for a symmetric distribution. It’s true by definition.
“That means in any sample it is probable that the MODE appears in the sample more often than the actual average of the population data.”
Correct.
“Thus the mode modulates the calculated mean of the sample means away from the actual average value and toward the mode value!”
And there’s your problem. And we’ve been over this before, so there’s no excuse. On average, the mean of a sample will equal the population mean. Yes, in any given sample you are more likely to have more values close to the mode, but that is offset by the values in the long tail. Say the mean is a lot larger than the mode: you will have samples with more small values, but that is compensated for by the size of the larger values. With small sample sizes you will get a skewed sampling distribution, which means there will be a higher chance of getting an individual sample mean that is below the population mean, but you will also get samples with a mean much larger than the population mean. The mean of all the sample means will be the same as the population mean.
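A quick simulation of that claim, using an assumed lognormal population, which is strongly right-skewed so that mode, median and mean all differ (numpy assumed):

import numpy as np

rng = np.random.default_rng(7)
# Assumed population: lognormal, strongly right-skewed, so mode < median < mean
population = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)

n = 30
sample_means = rng.choice(population, size=(20_000, n)).mean(axis=1)

print(f"population mean:       {population.mean():.3f}")
print(f"mean of sample means:  {sample_means.mean():.3f}")
# The sampling distribution is itself skewed for small n, but it is centred on the
# population mean, typically agreeing here to within a few thousandths.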
“I know you are going to reply that it is an insignificant bias…”
No. I’m arguing there is no bias at all.
“The mode value *will* “regulate” or “adjust” the value of the population mean”
How on earth can the mode change the population mean? The population mean is the population mean. If the population is skewed the population mean is still the population mean.
“What a load of crap! In an asymmetric distribution value expected from a single draw from the parent distribution, i.e. the EXPECTED VALUE of the parent distribution, is the MODE, not the average.”
That is not what expected value means. I’ve explained this to you several times. You refuse to accept the actual mathematical definition of expected value, but you haven’t supplied any reference supporting your definition. Here are a couple of easily found references on expected value.
https://www.geeksforgeeks.org/maths/expected-value/
https://statisticsbyjim.com/probability/expected-value/
https://brilliant.org/wiki/expected-value/
Also the GUM:
Now, my guess is that you are going to whine about how none of these count as they are talking about statistics. But that’s the problem. This is a statistical term, it is used in statistics, and that’s how it’s defined in statistics. If you want an alternative “real world” definition, then you need to show what the definition in your supposed real world is.
You might point out that Taylor uses it in a different way, but then he’s only using it to mean the value you were expecting, not in the way you are suggesting.
“Well, yes. If statistics don’t adequately represent reality”
Statistical descriptors do *NOT* represent reality. They represent the characteristics of the data used to derive statistical descriptors. It is the data that does or does not represent reality.
I have yet to see anything to disabuse me of this belief. See your recent reference from a university about random uncertainty being Gaussian!
But the mean in an asymmetric distribution is nothing more than a balance point, it is *NOT* the expected value for the next draw from the population!
That does *NOT* imply that the average of an asymmetric distribution will be the *best estimate* for the value of the measurand!
Your lack of reading comprehension skills is showing again. I didn’t say that. I said it will modulate the value of the mean of the sample means. If the mode shows up more often in the samples, then it will bias the mean of each sample away from the actual average of the population. This causes an INCREASE in the uncertainty of the mean of the sample distribution.
“That is not what expected value means.”
It is *exactly* what the term “expected value” implies. Otherwise if the mode is 10 and the mean is 5 — YOU would place your next bet on the 5 space instead of the 10 space. Most people would EXPECT the next value to be more likely to be 10 and not 5! Casinos must love to see you!
Almost all statistical descriptions of “expected value” employ circular reasoning. They basically say the “Expected value (of the average) is the average over the long run”. See the definition from Gemini: “The Mean represents the balance point (the long-run average)” Well, duh! That’s the same as saying “the average is the average”! That’s the problem with statistical world and the real world. This has absolutely *NOTHING* to do with sampling error and the pulling of the mean of the sample distribution away from the population mean by the mode.
The fact that you can’t distinguish the two shows just how much you live in statistical world! This actually gets back to your assertion that a single sample can adequately represent an asymmetric population. If you take enough samples then extreme values in some of the samples will balance out the multiple appearances of the mode in other samples. BUT, for a single sample (or even a small number of samples), the LLN and the CLT simply do not apply and the mode *WILL* modulate the mean of the single sample. How many samples are required to balance depends on the skewness of the parent distribution. I can’t seem to find any consistent method to use in determining this.
With all due respect to the site, this is *only* true for a symmetric distribution. The MODE is a measure of central tendency in an asymmetric distribution since it is the most frequent value. The value of the results in an asymmetric distribution will tend to the most frequent value and *NOT* to the mean. If this were not the case then people would always bet the average value and not the most frequent value.
You *NEVER* bother to list out the assumptions involved in anything you assert. See the reference from statisticsbyjim. The unstated assumption there is that the distribution is symmetric. It simply doesn’t apply to the general case unless, like you, it is assumed that all distributions are Gaussian.
For measurement uncertainty, the experimental standard deviation of the mean is useful in one and only one instance. That is when a single item is being measured in repeatable conditions.
It is why you can’t measure a single brake rotor multiple times under repeatable conditions and then assume all the others are the same. The experimental standard deviation of the mean is only applicable to that one item.
Do you understand what six sigma means to an engineer? If I design a circuit to meet requirements when components can vary to six sigma values, then I can be certain that only under very extreme conditions will the circuit fail.
I could care less how accurately a single component has its mean determined, I want to know the dispersion of values, i.e., the standard deviation.
“For measurement uncertainty, the experimental standard deviation of the mean is useful in one and only one instance.”
Why do you think this is about measurement uncertainty? You insist that an average is not a measurement, yet you want to talk about its measurement uncertainty.
An average of observations is merely an estimate of a measurement.
The GUM says this about uncertainty.
“An average of observations is merely an estimate of a measurement.”
I’ll ask again, do you consider the mean of a sample as a measurement?
Read GUM Section 4.2.1.
Look up GUM B2.15 and see if you can decipher what it means.
Does this mean anything to you?
It would seem you either believe you know better or you don’t understand what this is saying.
Then read GUM F1.1.2 and tell us what you think the measurement of a property entails.
No answer I see. It really shouldn’t be difficult, either you think the mean of a population of different things is a measurand or it isn’t.
“Read GUM Section 4.2.1.”
Why? Is it going to say anything different to the last few hundred times you’ve cut and pasted it? I keep saying that if you want to demonstrate you understand what’s being discussed, you need to explain what you think it says in your own words.
“Look up GUM B2.15 and see if you can decipher what it means.”
Certainly. It’s the VIM definition of the repeatability of a measurement result. It describes how closely two measurements of the same measurand should agree when carried out under the same conditions. Conditions include the same procedure, observer, instrument and location, and measurement within a short time span.
Do you want any other parts explained to you?
“Then read GUM F1.1.2”
I’ve already explained that passage to you multiple times. You still don’t seem to understand it.
Brilliant deflection.
It is up to you to either agree with the GUM or not.
If you agree, then you know that observations made under the same conditions of measurement are required for determining a mean of a random variable. See GUM B.2.15.
If you do not agree, then it is up to you to show references that support your conclusion of 4.2.1 being incorrect.
I mostly agree with the GUM. The problem is, I don’t think you understand what it’s saying. You could demonstrate you understand it by explaining it in your own words, rather than just cutting and pasting large globs of text.
“If you agree, then you know that observations made under the same conditions of measurement are required for determining a mean of a random variable”
You keep jumping from the abstract, a random variable, to the specifics of measuring a single thing. But claiming you need repeatability is just wrong. What you want is for all values to be iid and to represent the population. If you are measuring one thing then repeatability is useful as it gives you an identical distribution. But it is hardly necessary. You can measure the same thing with different instruments, each having different uncertainties, and still get a good estimate of its size. Bevington and Taylor both give examples of that and of how to weight the different observations according to their uncertainties.
However. If your random variable is a population of different things, it’s impossible to measure all the different things under repeatability conditions. Simply because they are different things. It doesn’t matter as they are still all taken from the same distribution. If you don’t think it’s possible to use statistics to estimate the mean of a population, then you are the one who needs to explain why every statistical text book, going back over a century, is wrong. Why every statistician is wrong. Why every scientist who uses hypothesis testing is wrong.
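The weighting mentioned above is the usual inverse-variance weighted mean; a sketch with invented readings and uncertainties (numpy assumed):

import numpy as np

# Hypothetical: the same quantity measured with three different instruments
values        = np.array([10.03, 9.98, 10.10])
uncertainties = np.array([0.02, 0.05, 0.10])    # standard uncertainty of each reading

weights = 1.0 / uncertainties**2                # more certain readings get more weight
weighted_mean = np.sum(weights * values) / np.sum(weights)
weighted_mean_u = 1.0 / np.sqrt(np.sum(weights))

print(f"weighted mean = {weighted_mean:.3f} ± {weighted_mean_u:.3f}")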
“See GUM B.2.15.”
I already pointed out what that says. It’s just the definition of repeatability of measurements. It doesn’t say anything about it being required.
“If you do not agree, then it is up to you to show references that support your conclusion of 4.2.1 being incorrect.”
You need to say why you think I’m saying it’s wrong. I agree with it. In many cases the best way to determine the random uncertainty of a measurement is through repeated measurements under the same conditions. The SD of the measurements is a type A uncertainty. You also have to go to the next paragraph to understand that when you do that, the mean of all your measurements will be a more certain value, and its uncertainty is the sample SD divided by root N.
The only caveats I would add are that you need to do a reasonable number of measurements. If you only have a few measurements there will be a lot of uncertainty in the SD, and it may be better to use an estimate of the uncertainty, a type B uncertainty.
All of this also applies to sampling.
The other aspect, which applies more to measurement, is that you need the variance to be larger than the resolution. Otherwise you could measure the same thing 100 times and get exactly the same result, as all values are being rounded to the same low resolution value. You definitely don’t want to assume the uncertainty is zero in that case.
“in other words, at a low order of precision no increase in accuracy will result from repeated measurements.”
[Smirnoff, Michael V., 1961, Measurements for engineering and other surveys, Prentice Hall, p. 29]
This directly calls into question the practice of averaging many temperature readings that only have a resolution of 1 degree, and then claiming that the average temperature is known to several significant figures, which is the essence of this exchange.
I use this in an essay I am working on discussing precision and resolution.
If I have a digital voltmeter with one decimal digit and a voltage reference of 1.02V what occurs?
I will read 1.0V each and every time I make a measurement. This will appear to be a highly precise reading because it always gives the same value.
I also know the resolution uncertainty is ±0.05V or an interval of 0.95V to 1.05V.
Does the value lie within the interval? Yes.
Do I know exactly what the value is? No, and I’ll never know regardless of how many measurements I take. Dividing by the √n will not increase the resolution at which the measurements were taken.
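A sketch of that voltmeter scenario, assuming every reading is simply the true 1.02 V rounded to one decimal place, with no noise to push any reading across the 0.95 V or 1.05 V boundary:

import numpy as np

true_voltage = 1.02
readings = np.round(np.full(1_000_000, true_voltage), 1)   # a one-decimal meter, no noise

print(readings.mean())        # 1.0 exactly, however many readings are averaged
print(readings.std(ddof=1))   # 0.0: the spread carries no information about the missing 0.02 V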
“This directly calls into question the practice of averaging many temperature readings that only have a resolution of 1 degree, and then claiming that the average temperature is known to several significant figures, which is the essence of this exchange.”
No. The point is that when temperatures vary by much more than 1 degree throughout the Earth, the resolution of 1 degree becomes largely irrelevant.
I know people here would prefer not to believe that, but it’s easy to demonstrate using random values, or real world data, or simply through the application of statistics. A 1 degree resolution is just a random measurement uncertainty of half a degree. Propagate that through thousands of measurements and it becomes much smaller, especially compared with the variance of the temperatures.
That unsupported assertion does not make sense. The crux of the disagreement is why alarmist climatologists feel an imperative to improve the precision of the estimate of the mean and then you come along and tell us “the resolution of 1 degree becomes largely irrelevant.”
“That unsupported assertion does not make sense.”
It’s an assertion I’ve supported many times in the past. And to me it makes perfect sense. The simplest support I can offer is to look at it using random numbers. Generate a large number of figures from a known continuous distribution. Round them all to the nearest integer and then see how that affects the sample mean.
I’ve done this many times using both random numbers and real world figures, and the result is always the same. As long as there is enough variation in the data, and a large enough sample, the results from the rounded data is virtually the same as the results from the raw data.
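A sketch of that check with made-up numbers: 30,000 values whose spread (about 10 degrees) is much larger than the 1-degree rounding step (numpy assumed):

import numpy as np

rng = np.random.default_rng(3)
temps = rng.normal(12.0, 10.0, 30_000)      # assumed daily temperatures, spread >> 1 degree

mean_raw     = temps.mean()
mean_rounded = np.round(temps).mean()       # same data rounded to the nearest degree

print(f"raw mean:     {mean_raw:.4f}")
print(f"rounded mean: {mean_rounded:.4f}")  # typically differs by only a few thousandths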
Virtually? You are admitting that there is a difference. That is the crux of the problem, which is, how much is the estimate of a mean affected by the way numbers are handled, and what is the best approach for getting the correct result?
Of course there’s a difference. You are using different values. The point is those differences are negligible compared to the overall uncertainty.
I’ll have to redo my tests, but from memory it was something like taking the average of all the CRN daily values, reported to 1 decimal place, then taking the average having rounded all daily values to the nearest degree. With around 30,000 daily values, the difference between the two was less than a hundredth of a degree.
Even if I rounded the values to the tens-of-degrees digit, I could still get an average within about a tenth of a degree of the first average.
Measurements have two components, the estimated value and the measurement uncertainty associated with that estimated value.
The issue is *NOT* how close estimated values arrived at using different methods are but how ACCURATE each is.
As usual, you just focus on the estimated value and ignore the measurement uncertainty — JUST LIKE CLIMATE SCIENCE does!
No. We were talking about how much resolution effects the average.
A 1 degree resolution is just a random measurement uncertainty of half a degree.
Same old stuff, second verse. You are so off base you are unsavable. The “random error” will only cancel out if one assumes that all temperatures are a true value. That is an extremely old meme that is no longer supported under the GUM (since about 1980). The GUM doesn’t even mention the term “random uncertainty”.
Your personal meme and that of climate science is to reduce supposed inaccuracies to their smallest values regardless of how. Skipping accepted scientific protocols to obtain an answer that you like has become second nature.
Look at Eq. 10 in the GUM, do you see that Σ, that means add. Do you see the squared terms? That means there are no negative terms. You have all squared uncertainties added. How do you cancel “random uncertainties” if they are only positive and added?
Give it up dude.
“A 1 degree resolution is just a random measurement uncertainty of half a degree.”
Yes, that’s my point. It’s effectively a random measurement uncertainty, as long as the variance is bigger than the resolution. And as such, averaging a large number of such values reduces that uncertainty. Just as it does any other random uncertainty.
“The “random error” will only cancel out if one assumes that all temperatures are a true value.”
Not sure what you mean there. Temperature is a true value. The temperature reading is not. It’s not going to be the true value as it’s rounded to the nearest unit.
“The GUM doesn’t even mention the term “random uncertainty”.”
So? Call it what you want, the calculations are the same.
“Look at Eq. 10 in the GUM, do you see that Σ, that means add.”
Please. Your patronizing tone is so unbecoming. I’ve been explaining equation 10 to you two for years, and you still don’t get how it works.
“How do you cancel “random uncertainties” if they are only positive and added?”
One day you are going to actually define what you mean by cancel. Uncertainties, when you are adding or subtracting values cancel in the sense that they don’t simply add. The uncertainty of x + y is not u(x) + u(y), it’s √[u(x)² + u(y)²], which is inevitably smaller. That’s because there is likely to be some cancellation. It’s why variances add when you add or subtract independent random variables. It’s why whenever you add any number of random numbers you are more likely to be close to the average than to the extremes, and why when adding iid random variables the standard deviation is given by √nσ, and when taking their average the deviation is σ/√n.
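For what it’s worth, the scaling in that last sentence is easy to check numerically; a minimal Python sketch assuming iid uniform errors of ±0.5 (the sizes and counts are arbitrary):

    import random, statistics

    random.seed(1)
    n, trials = 100, 20_000
    sums, means = [], []
    for _ in range(trials):
        errs = [random.uniform(-0.5, 0.5) for _ in range(n)]   # iid "resolution" errors
        sums.append(sum(errs))
        means.append(sum(errs) / n)

    sigma = 1 / 12 ** 0.5                              # SD of one uniform(-0.5, 0.5) error, ~0.289
    print(statistics.stdev(sums),  sigma * n ** 0.5)   # both ~2.89
    print(statistics.stdev(means), sigma / n ** 0.5)   # both ~0.029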
√[u(x)² + u(y)²] < u(x) + u(y) does *not* mean u(x) – u(y) = 0
You are just shoving together two different points that have nothing to do with each other.
Whether you add measurement uncertainty directly or in quadrature is to be determined by the person doing the measurements. Adding directly is typically the worst case – highly desirable in a situation affecting civil or criminal liability – and adding in quadrature is the best case – highly desirable in a situation that can withstand large tolerance variance.
They ADD in both cases!
They add because you use VARIANCE, a positive number. There is *NO* cancellation possible. It’s always u1 + u2 + … + un. There is *NO* -u1, -u2, …, -un to provide cancellation.
“When you subtract dependent uncertainties with a correlation of 1, you get the equation on the right. “
You don’t SUBTRACT uncertainties! Where do you see a minus sign in √[u(x)² + u(y)²]?
If there is possible cancellation of measurement uncertainties then you ADD them in quadrature – BUT YOU STILL ADD THEM, YOU DO *NOT* SUBTRACT THEM!
From GUM Eq 13, it’s all ADDITION!
2 · Σ(i=1 to N−1) Σ(j=i+1 to N) (∂f/∂x_i)(∂f/∂x_j) u(x_i, x_j)
The GUM also says about this:
“Equation (10) and those derived from it such as Equations (11a) and (12) are valid only if the input quantities X are independent or uncorrelated (the random variables, not the physical quantities that are assumed to be invariants — see 4.1.1, Note 1). If some of the Xi are significantly correlated, the correlations must be taken into account.”
Temperature measurements measuring different things using different instruments under different conditions are *very* unlikely to be significantly correlated.
As usual, you are trying to argue a *very* specific instance is actually the general case – and *STILL* getting the very specific case wrong.
“They ADD in both cases!”
Except in the case I keep mentioning. This is your “cherry-picking” problem. You have a generally true statement, but fail to understand the maths behind it, and why there can be an exception to that rule.
I’ll explain this one more time. Uncertainties add because variances add when adding or subtracting independent random variables. If the variables are not independent you have to factor in an additional term for the covariance. This is the case when there is an unknown systematic error.
The result of this is that when you add two non-independent variables the variance is greater than the sum of the individual variances, if the correlation is positive. But the variance is less if the correlation is negative. When you are subtracting, the reverse is true. Subtracting two random variables with a positive correlation will result in a term being subtracted from the variance. This is the exception to your rule that uncertainties always add.
In the case where the correlation is +1, adding the variables will result in the standard uncertainties adding directly. But when you subtract the variables the covariance term is subtracted, and if the variances are the same they will completely cancel.
All of this can be seen in the GUM equation 13.
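For concreteness, here is the two-input version of that equation written out numerically; the 0.5 uncertainties and the correlation values are assumed for illustration, not taken from anything above:

    def combined_u(c1, c2, u1, u2, r):
        # Two-input form of GUM Eq. 13: c1 and c2 are the sensitivity
        # coefficients (partial derivatives), r the correlation coefficient.
        var = (c1 * u1) ** 2 + (c2 * u2) ** 2 + 2 * c1 * c2 * r * u1 * u2
        return max(var, 0.0) ** 0.5

    u = 0.5
    print(combined_u(1, -1, u, u, 0.0))   # subtraction, independent:      ~0.71 (quadrature)
    print(combined_u(1, -1, u, u, 1.0))   # subtraction, fully correlated:  0.0
    print(combined_u(1,  1, u, u, 1.0))   # addition,    fully correlated:  1.0 (direct sum)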
You mention equation 13, but still don’t get it. The subtraction comes from the fact that when you are subtracting a value its partial derivative is -1.
You even quote the part where the GUM explains that equation 10 only applies to independent values. Yet you don’t seem to understand the implication.
“Temperature measurements measuring different things using different instruments under different conditions are *very* unlikely to be significantly correlated.”
Which was the point I made some time ago. But in this case we are talking about two measurements with the same systematic error. Hence they are correlated. This might happen because they are both being made with the same instrument, or because there are environmental factors.
“As usual, you are trying to argue a *very* specific instance is actually the general case”
I’m not saying it’s the general case. I was specifically responding to a claim that uncertainties always add even when you have the same systematic error. And this is directly applicable to the case of an anomaly, where you are subtracting a base value from a specific reading. If both are being made with the same instrument in the same location and similar circumstances, then there is a likelihood that some systematic factors will be common to both.
“Except in the case I keep mentioning.”
You mean that special case you keep trumpeting as being of GENERAL applicability? The special case where all measurement uncertainty is random, Gaussian, and cancels?
How do you *KNOW* it totally cancels? You can assume PARTIAL cancellation which is where the root-sum-square method of ADDING measurement uncertainties comes from.
There is a *reason* why metrology defines the interval of reasonable values as the measurement uncertainty. If *all* of the measurement uncertainty cancels then the SD of the parent distribution would be ZERO, not the SEM, but the SD of the parent distribution since there would be no interval of reasonable values that could be assigned to the measurand.
“But the variance is less if the correlation is negative.”
from the GUM:
“where r(xi, xj) = r(xj, xi), and −1 ≤ r(xi, xj) ≤ +1. If the estimates xi and xj are independent, r(xi, xj) = 0, and a change in one does not imply an expected change in the other.”
If the measurements are *NOT* independent but are negatively correlated then that leads to the assumption that there is a common factor in all of the measurements – e.g. the calibration offset is the SAME for each and every measurement instrument used and is involved in each and every measurement.
This is all part of your special case where you have multiple measurements of the same thing using the same device under the same conditions. I.e. all measurement uncertainty is random, Gaussian, and cancels. You just can’t get away from that meme, can you?
If you have single measurements of different things using different devices under different conditions then the probability that the correlation between measurements is 0 (zero) is the most likely outcome.
And *that* is the general case in the real world of temperature measurement. It’s the general case in the *real* world. If you take six measurements of each journal in an engine the results of the six measurements are NOT correlated between journals since you are measuring different things and each of those things will have different wear characteristics. The six measurements of each journal *may* be correlated but how correlated they are depends on the measurement protocol being exactly the same for each of the six measurements, e.g. the force applied to the measurand by the anvil head of the micrometer has to be EXACTLY the same or the correlation will *NOT* be 1 even for the measurement of the same thing! You will *NOT* get total cancellation of measurement uncertainty.
You keep wanting to force your special case down everyone’s throat when it is just not applicable in the real world.
“You mean that special case you keep trumpeting as being of GENERAL applicability? The special case where all measurement uncertainty is random, Gaussian, and cancels?”
I think I’ll leave this here. It’s clear Tim is either trolling or suffering from some form of dementia, and either way it’s not worth continuing this discussion. I’ve explained multiple times that this is not about random uncertainties, but about a systematic error.
In other words you have absolutely nothing to offer and are looking for an easy way out!
It’s about your assertion that ALL SYSTEMATIC ERROR IS EQUAL AND CANCELS.
It was wrong from the beginning and it’s wrong now.
The force applied by the anvil heads to the measurand *IS* systematic uncertainty. And it changes for each measurement. Therefore it can *never* completely cancel and it, therefore, winds up adding to the total measurement uncertainty.
The only one here with dementia is you and it stems from you living in a blackboard statistical world where you can ASSume all kinds of things that don’t apply in the real world.
It’s all about deflection. Having been caught out in the lie that I think all uncertainties are random, Tim shifts to a completely different lie, that I think all systematic error is identical. And he won’t ever acknowledge that these two lies are contradictory.
You didn’t even read what I said.
Where in the GUM is “random uncertainty” mentioned? Show us a copy and paste of that term from the GUM.
If you can’t find it, then using it automatically means you are making an incorrect assertion.
You are again emphasizing your predilection of confusing errors (true value based) with uncertainty (probability based).
Put some numbers in your expression and show us how you get a smaller number.
Here is my example. u(x) = ±0.5 and u(y) = ±0.3
√((0.5)² + (0.3)²) = √(0.25 + 0.09) = √(0.34) = 0.58
Pick any amount of numbers and any value of standard deviations, even negative values, and show what you get.
Using RSS does reduce that perceived uncertainty as compared to the possible full uncertainty. The point is that the values are still added and are larger than any single component uncertainty. THEY NEVER SUBTRACT AND CANCEL. You are creating a COMBINED UNCERTAINTY, that is, a sum of the component uncertainties.
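The arithmetic in that example, checked in a couple of lines (quadrature versus the worst-case direct sum):

    u_x, u_y = 0.5, 0.3
    rss = (u_x ** 2 + u_y ** 2) ** 0.5    # quadrature: ~0.583
    direct = u_x + u_y                    # worst-case direct sum: 0.8
    print(rss, direct)                    # the RSS result is below the direct sum
                                          # but above either single component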
The equation for σ is “SEM • √n”. The GUM defines the standard deviation σ as the standard uncertainty.
You refuse to acknowledge that under specific assumptions, the standard uncertainty of the mean can indicate a smaller interval where the mean may lie, but that the standard deviation defines the dispersion of observations of measurements, that is measurement uncertainty.
Give us all a break and recite the GUM entries that support your assertions about measurement uncertainty. Your use of terms and concepts that are either not in the GUM or are the opposite of what the GUM says has grown tiresome, to say the least.
Multiple times bellman has come out into the open and declared that he doesn’t agree with everything in the GUM. He never states the details of these disagreements, probably because doing so would further expose his incompetence.
Amen!
“You didn’t even read what I said.”
I thought I was quoting you. But looking back it seems you were quoting me, without quotation marks. I should have guessed, as the sentence made sense.
“Where in the GUM is “random uncertainty” mentioned?”
Do you think that if the GUM doesn’t use a term it ceases to exist? I said before that I use the term to indicate uncertainty caused by random, independent effects. This is to make a distinction from uncertainties caused by systematic errors.
“If you can’t find it, then using it automatically means you are making an incorrect assertion.”
You do like your arguments from authority. The GUM is the source of all truth. Anything not in the GUM is evil and must be destroyed.
“You are again emphasizing your predilection of confusing errors (true value based) with uncertainty (probability based).”
You really don’t know what you are arguing against or for. Error-based uncertainty is probability based, just as the GUM’s paradigm is.
The main distinction is that error implies a frequentist paradigm of probability, whereas the GUM is using a Bayesian one, though they avoid stating that explicitly.
“Put some numbers in your expression and show us how you get a smaller number.”
It’s smaller than the sum of the two uncertainties. That’s because of cancellation. If there was no cancellation the uncertainty would just be the sum. Sorry if that wasn’t clear.
“THEY NEVER SUBTRACT AND CANCEL”
Apart from when they do. And that’s when the uncertainties are not independent. I don’t know how many times I have to repeat this before you acknowledge that not all uncertainties are random.
“The equation for σ is “SEM • √n”.”
Why are you so obsessed with that truism? Of course it’s true, because the SEM is σ / √n. It’s just that the SEM is the more useful form, as you normally have a good estimate of σ and can derive the SEM from that. But you rarely know the SEM without knowing the standard deviation.
“The GUM defines the standard deviation σ as the standard uncertainty. ”
Not for a mean they don’t. The uncertainty of the mean of multiple measurements is the SEM, or whatever they call it.
“…but that the standard deviation defines the dispersion of observations of measurements, that is measurement uncertainty.”
No, that’s your misunderstanding of the GUM’s definition.
“Apart from when they do. And that’s when the uncertainties are not independent. I don’t know how many times I have to repeat this before you acknowledge that not all uncertainties are random.”
Independence is *NOT* a sufficient criteria for total cancellation. Total cancellation seems to be your goal. If the dependence is not -1 then you will never get total cancellation and your use of the SEM as the measurement uncertainty will be incorrect. And the measurement uncertainties *will* add if you don’t get total cancellation.
Once again, you are living in statistical world and not the real world.
If you can apply EXACTLY the same force of the anvil of a micrometer on the measurand EACH AND EVERY TIME then you are a better man than any machinist I have ever known. If you can’t then the measurement uncertainties will ADD, not subtract.
“Independence is *NOT* a sufficient criteria for total cancellation.”
And again, Tim is unable to see that I’m saying “Not independent” even when he quotes me.
Your lack of reading comprehension skills is showing again.
Do you see the word “total” anywhere in what I said?
“Do you see the word “total” anywhere in what I said?”
Your lack of reading comprehension skills is showing again.
Total cancellation is *NOT* the same thing as TOTAL INDEPENDENCE – and it was independence that was the subject in your sentence, not cancellation.
Perhaps it’s your writing that’s incomprehensible, not my reading skills.
I’ve still no idea what you think you are arguing. You claim I think that all uncertainties are random – i.e. independent. Then when I give an example of dependent uncertainties cancelling out when subtracting, you claim I only believe that because I believe all uncertainties are random.
Then you start throwing the word “total” about as a distraction.
Then you expect me to know that when you claim you never used the word “total” in your comment, you actually mean you used it lots of times, just not in the context I was meant to guess: you were only asking about total independence.
A sentence like “Independence is *NOT* a sufficient criteria for total cancellation.” is meaningless, unless you think someone was claiming independence is a sufficient criteria.
What I’m actually saying is that “total non-independence is a sufficient criteria for total cancellation”, provided by “total non-independence” you mean a correlation of +1, and the same variance in each variable.
Random means the next value is not predictable. Independence means prior choices don’t affect the next value.
Being dependent does *NOT* mean you can predict the next value.
You don’t *subtract* uncertainties. You ADD uncertainties. Cancellation only implies that the addition method gives less than the worst case of direct addition. Root-sum-square is an ADDITION of uncertainties, not a subtraction.
2 + 2 = 4, √(2² + 2²) ≈ 2.8
Both are additions, not subtraction.
An uncertainty of +/- 2 is an interval of 4. Two such uncertainties added directly become +/- 4, meaning the interval becomes 8. You do *NOT* do (+2 – 2) to get the interval as 0.
An uncertainty of +/- 3 is an interval of 6. Two uncertainties of +/- 3 added directly become +/- 6, an interval of 12. You do *NOT* do +3 – 3 = 0.
Uncertainty is related to variance. Variance is *NEVER* negative because its components are always squared. (+2)^2 = 4, (-2)^2 = 4.
This is at the base of your misunderstanding that the SEM can be the measurement uncertainty of the average. The measurement uncertainty of the average is NEVER the SEM. It is always inherited from propagating the measurement uncertainties of the parent distribution components.
Random, independent, and Gaussian only means that the average is the BEST ESTIMATE of the parent distribution average. It tells you NOTHING about the measurement uncertainty associated with the parent distribution and therefore with the measurement uncertainty of the average. Assuming random, Gaussian, and total cancellation of the measurement uncertainty is the same idiocy that you and climate science uses to dismiss measurement uncertainty so you can claim you know the “global average temperature” down to the hundredths digit.
The measurement uncertainty of a random, Gaussian, and independent distribution is related to the square of its standard deviation, i.e. the variance. Unless every single data point in the parent distribution is exactly equal, the standard deviation cannot be zero, so neither can the variance or the measurement uncertainty. When quoting the best estimate of the quantity being measured you state it as “best estimate +/- measurement uncertainty”. And that measurement uncertainty is associated with the standard deviation of the random and Gaussian distribution – it is *NOT* the SEM. The SEM is only a statistical descriptor for how well you have located the parent distribution average, i.e. the best estimate of the value.
Why you insist on subtracting measurement uncertainty is just proof that you have *never* bothered to actually understand the concepts of metrology.
“You don’t *subtract* uncertainties.”
Your problem is you are not looking at the full general equation. Taylor only gives the equation for independent random variables, but Bevington and the GUM show the full equation. Equation 13 in the GUM. That includes a factor for correlation. If there is no correlation, that becomes zero and you get the standard equation, which is a lot simpler.
But when you have some correlation you need to include that factor, and it will be negative if either the correlation is negative, or if the partial derivative is negative. This is the case when you are subtracting two correlated inputs.
“An uncertainty of +/- 2 is an interval of 4. Two such uncertainties added directly become +/- 4, meaning the interval becomes 8. You do *NOT* do (+2 – 2) to get the interval as 0.”
You do if the ±2 errors have a correlation of 1. That means that if one error is +2 the other will also be +2, and +2 − (+2) = 0.
“Uncertainty is related to variance. Variance is *NEVER* negative because its components are always squared.”
You still don’t get it. Uncertainty is always positive; the squaring is not the point. What is relevant is that all the partial derivatives in equation (10) are squared, hence even the negative ones become positive. That’s why random uncertainties add even when the function involves subtraction.
But the extra term for covariance in equation (13) is not squared. The partial derivatives may be negative, and the covariance u(x_i, x_j) may also be negative. Hence that part of the uncertainty may be a subtraction.
“This is at the base of your misunderstanding that the SEM can be the measurement uncertainty of the average.”
And there we go into another distraction, that has nothing to do with your previous point.
“Assuming random, Gaussian, and total cancellation of the measurement uncertainty is the same idiocy that you and climate science uses to dismiss measurement uncertainty so you can claim you know the “global average temperature” down to the hundredths digit.”
It’s pointless me trying to explain to the brick wall yet again, that assuming random uncertainties is the exact opposite of what I’m doing. Random uncertainties (if you are allowed to call them that) will just add in quadrature, when subtracting values.
You have *NEVER* bothered to actually understand how this all works.
For the GUM, Eq 13
x_i and x_j are components in a functional relationship, e.g. y = x_1 + x_2.
A negative correlation of x_1 and x_2 means that when x_1 goes up, x_2 goes down such that y remains the same.
The result of this is that the variance of y is reduced by the negative correlation since y is always the same value. If the uncertainty of y is its variance then the uncertainty in y becomes 0.
You totally misunderstand what this actually means. It does *NOT* mean that the inaccuracy of y becomes zero. It only means that the estimated value of y remains constant. They are *not* the same thing.
What you also miss is that this requires your typical meme of “all measurement uncertainty is random, Gaussian, and cancels” to be assumed.
As I keep telling you, YOU DON’T EVEN REALIZE WHEN YOU ARE USING THE MEME OF “ALL MEASUREMENT UNCERTAINTY IS RANDOM, GAUSSIAN, AND CANCELS”.
You *still* don’t get that the uncertainty contribution won’t be negative because you *have* to use relative uncertainties added in quadrature. When added in quadrature what happens to the negative sign from the partial derivative? You’ve never actually figured out what Possolo did with the barrel, have you?
The usual deluded gish gallop. Really not worth going over every point when Tim makes it abundantly clear he is unable to learn anything.
It’s sad because the first part is more or less correct, but he won’t draw the obvious conclusions. Negative correlation means when one thing goes up the other comes down. This means when you add them the correlation will be cancelled. Same if you have positive correlation and subtract the two values. He agrees that this will mean the variance will come down. But then he asserts
“It does *NOT* mean that the inaccuracy of y becomes zero. It only means that the estimated value of y remains constant. They are *not* the same thing.”
And he doesn’t seem to understand what these variances are. In this case the standard deviation of each input is the measurement uncertainty – both from random and systematic factors. If there is a complete correlation between the two, it means there is no random factor and both have the same systematic factor. Only in that, admittedly implausible, situation will the combined variance be zero, and you have a difference with no uncertainty.
However, more usually there will be a combination of random factors and uncorrelated systematic factors along with some common systematic factors. In that case the uncertainty will not reduce to zero. But it will eliminate the common systematic factor.
That is all I have been saying all along.
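A quick simulation of that mixed case; the sizes of the shared offset and the independent noise are invented purely for illustration:

    import random, statistics

    random.seed(0)
    diffs = []
    for _ in range(50_000):
        common = random.gauss(0, 0.4)      # systematic offset shared by both readings (assumed size)
        e1 = random.gauss(0, 0.2)          # independent random part of reading 1
        e2 = random.gauss(0, 0.2)          # independent random part of reading 2
        reading  = 15.0 + common + e1      # e.g. a current value
        baseline = 14.0 + common + e2      # e.g. the base-period value
        diffs.append(reading - baseline)   # the "anomaly"

    print(statistics.mean(diffs))          # ~1.0: the shared offset drops out
    print(statistics.stdev(diffs))         # ~0.28 = sqrt(0.2^2 + 0.2^2); the 0.4 never appears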
But then comes the nonsense. He just keeps repeating the lie that I believe all uncertainties are random, still with no comprehension that this is about systematic errors, not random uncertainties. And for good measure he starts writing everything in all caps, a sure sign of an unsafe argument, and ignoring WUWT policy to boot.
Then we have
“if either x_1 or x_2 has systematic measurement uncertainty, then the correlation factor in Eq 13 can’t be used.”
Wrong. The point of the GUM’s uncertainty paradigm is to say that systematic and random factors can be handled by the same equation. What should be obvious though is that if you have factors of uncertainty that are common to different inputs, then you have dependency and have to use equation (13).
“if either x_1 or x_2 has an asymmetric measurement uncertainty then the correlation factor in Eq. 13 can’t be used. A Type B uncertainty estimate should be applied instead.”
You can use equation 13 as an approximation, just as you can with non-linear functions. Or you can use Monte Carlo methods. But this has nothing to do with the question of systematic errors cancelling due to subtraction. It’s just a typical Gormanesque distraction.
“Temperature measurements from different locations do *NOT* form a functional relationship.”
Of course two different locations can have a functional relationship. But it’s irrelevant to the case of an anomaly, which is subtracting measurements made at the same location.
“Correlation between locations, especially those more than a few feet apart, is primarily seasonal and disappear if the seasonality correlation is removed.”
He’s just trying to kill with an overdose of irony at this point.
“You *still* don’t get that the uncertainty contribution won’t be negative because you *have* to use relative uncertainties added in quadrature. ”
These are not relative uncertainties, and the term for covariance is not squared.
“You’ve never actually figured out what Possolo did with the barrel, have you?”
What Possolo does in his private life is up to him. If you mean the example of the water tank, then I keep demonstrating how it’s done. It’s just the simplification of equation (10) where the function consists entirely of multiplication, division, and raising to a power. It’s given as equation (11) in the GUM, and is easy to derive from (10). You just divide the entire equation by the square of the combined value. This gives you an equation consisting of relative uncertainties.
But the thing Tim never seems to understand is that you can only do this when everything is multiplicative. It does not work when adding or subtracting. As such it’s completely irrelevant to the question of subtraction. And it also only works if all your uncertainties are independent.
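A sketch of that relative-uncertainty simplification, using an invented cylinder rather than Possolo’s actual tank example; the dimensions and uncertainties are made up:

    import math

    # Hypothetical cylinder, V = pi * r^2 * h: purely multiplicative, so the
    # relative-uncertainty form discussed above applies. Values are invented.
    r, u_r = 0.50, 0.005
    h, u_h = 1.20, 0.010

    V = math.pi * r ** 2 * h
    rel_u = math.sqrt((2 * u_r / r) ** 2 + (u_h / h) ** 2)   # exponents become the weights
    print(V, rel_u * V)

    # Same answer from the general form (Eq. 10): dV/dr = 2*pi*r*h, dV/dh = pi*r^2
    u_V = math.sqrt((2 * math.pi * r * h * u_r) ** 2 + (math.pi * r ** 2 * u_h) ** 2)
    print(u_V)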
You have *still* not studied anything on metrology.
Every single expert I have access to says systematic uncertainty can *NOT* be analyzed using statistical methods. You were given the quotes earlier in *this* thread.
Equation 10 and its related equations ALL assume random, not systematic.
Of course it applies. One more time. Take two boards from different production lines whose lengths are negatively correlated because of tolerances in the production lines. You are suggesting that the measurement uncertainty of the total length (i.e. y = x1 + x2) when they are used in a bridge span will be zero. Why don’t you ask a carpenter about that some time?
“Of course it applies. ”
More evidence that Tim never reads what he’s replying to. My comment about not applying when adding or subtracting was about using relative uncertainty, as in equation (11). Tim’s reply never addresses that, but argues about correlated measurements.
“This means when you add them the correlation will be cancelled.”
The correlation IS NOT CANCELLED. It remains. Your statement is idiotic in the extreme.
What it truly means is that the measurement uncertainty of the functional relationship becomes INDETERMINATE. If there are two variables in the functional relationship whose correlation is -1, it does *NOT* mean that the measurement uncertainty becomes zero. If both are off by 1 unit because of systematic uncertainty, the output of the functional relationship will not be accurate, it *will* have a measurement uncertainty due to the systematic uncertainty. Since the measurement uncertainty propagation equation relies on the assumption of no systematic uncertainty, it is the entire equation that doesn’t truly apply in the real world of field measurement where systematic uncertainty can’t be assumed to be zero.
“And he doesn’t seem to understand what these variances are. In this case the standard deviation of each input is the measurement uncertainty”
You are *STILL* stuck in statistical world with the assumption that all measurement uncertainty is “random, Gaussian, and cancels”. This meme *ONLY* applies in one specific case and it is barely true even then. It only applies when you are measuring the *same* thing using the *same* instrument under the *same* conditions at the *same* point in time.
You can deny that this meme colors everything you post but it still just comes shining through in everything you post!
First, there is nothing defined as random uncertainty in the GUM. There are Type A and Type B uncertainties that are standard deviations. Uncertainties as standard deviations are always positive.
Second, nice try to deflect into another issue. Your post pretty much illustrates your misunderstanding of measurements and how they are evaluated.
Let’s discuss what an input quantity is. An input quantity is a measure of a unique physical characteristic. Different input quantities can be combined through a functional relationship which provides a unique value for the measurand.
Each input quantity can have multiple measurements that are part of a random variable. Each unique random variable can have its own Type A statistical evaluation to determine a mean and standard deviation for that input quantity. The standard uncertainty of the measurement of each input quantity is the standard deviation and the mean value is the center of the uncertainty interval.
For our purposes, temperature measurements are independent, i.e., they are not correlated. The measurement of today has no direct effect on the measurement tomorrow. They are both unique random variables with unique quantities contained in them.
Do not conflate direct correlation with auto-correlation, they have different definitions and connotations.
Correlation requires a functional definition with input quantities that vary in relation to each other. An example is PV = nRT. With “nRT” constant, PV are related, that is, when P goes up, V goes down. That is, they are correlated. The measurement of one depends on the measurement of the other.
Auto-correlation is being related in time. That becomes an issue in trending time series. The measurement at T_(t+1) is not directly affected by the value of the measurement at T_(t). Therefore the two measurements are independent, but auto-correlated.
End result, GUM 5.2 is not applicable to temperature measurements.
“Uncertainties as standard deviations are always positive.”
Thank you. You finally accept what I was telling you and Pat Frank.all those years ago.
“Second, nice try to deflect into another issue.”
I’m not sure what issue you are talking about. I’m just trying to explain why it’s nonsense to claim that I believe all uncertainties are random when I’m actually talking about systematic effects. Pointing out that if they are random you would not see the result I’m explaining, is hardly a distraction.
“The standard uncertainty of the measurement of each input quantity is the standard deviation and the mean value is the center of the uncertainty interval.”
You keep getting this part wrong. If the measurement of an input quantity is derived from the mean of multiple measurements, then its uncertainty is the uncertainty of that mean, given by the “experimental standard deviation of the mean”.
“For our purposes, temperature measurements are independent, i.e., they are not correlated.”
What purpose is that? Of course temperatures are correlated.
“The measurement of today has no direct effect on the measurement tomorrow.”
Complete nonsense. You are just claiming auto-correlation doesn’t exist.
“Do not conflate direct correlation with auto-correlation, they have different definitions and connotations.”
Why? Both are correlations. They will have the same effect when using equation (13). And this is just another distraction from the main point: what happens when you subtract one value from another when the values are correlated.
“Correlation requires a functional definition with input quantities that vary in relation to each other.”
It doesn’t require that you know why two values are correlated, if that’s what you are arguing. Knowing that two values are correlated may indicate a hidden variable, but you don’t need to know what it is to see the correlation.
“Therefore the two measurements are independent, but auto-correlated.”
What do you think independent means? In probability theory independence implies zero correlation. The converse is not true. It’s possible for two things to be uncorrelated but not independent. But it’s impossible for something to be independent and correlated.
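(The standard counterexample for that converse, sketched with made-up numbers: a symmetric variable and its square are uncorrelated yet clearly not independent.)

    import random, statistics

    random.seed(5)
    x = [random.gauss(0, 1) for _ in range(200_000)]
    y = [v * v for v in x]                      # y is completely determined by x

    print(statistics.correlation(x, y))         # ~0 (Python 3.10+): uncorrelated...
                                                # ...yet x and y are obviously not independent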
And none of this has anything to do with the question of anomalies. We are not talking about auto-correlation in any obvious sense. Just the correlation that exists when using the same station in the same location at the same time of year.
Sorry dude. One sample random variable with several observations is ONE measurement. To use the SDOM requires multiple measurements. You’ve already been shown this in Dr. Taylor’s derivation for the SDOM. Why do you keep acting like it has no relevance?
This is also confirmed in GUM 4.1.4, 4.1.5, and 4.1.6.
It would do you tremendous good to STUDY metrology rather than creating stuff out of the air.
“One sample random variable with several observations is ONE measurement.”
With an uncertainty determined by the experimental standard deviation of the mean, or SEM. You keep insisting I need to study metrology, but it’s so obvious that however much you have studied it, you still don’t understand it.
“To use the SDOM requires multiple measurements.”
Of course.
“Why do you keep acting like it has no relevance?”
What particular relevance does it have? Why do you think Taylor’s derivation is more appropriate than any standard derivation?
“This is also confirmed in GUM 4.1.4, 4.1.5, and 4.1.6.”
What’s confirmed? You really need to spell out exactly what you think these three sections say, and why it’s inappropriate to use the uncertainty of the mean when using the mean of several measurements as the input quantity. I can see nothing in any of those sections to say otherwise.
Meanwhile, if you want an actual example of doing just that look at example H.2 from the GUM, approach 1. Inputs are means of 5 measurements for each input, and the uncertainty is the Experimental standard deviation of mean.
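For anyone following along, the quantity being argued over is computed like this; the five readings are invented, not the H.2 data:

    import statistics

    readings = [99.94, 100.02, 99.98, 100.05, 99.96]   # five hypothetical repeated observations

    mean = statistics.mean(readings)
    s = statistics.stdev(readings)          # experimental standard deviation of the observations
    sem = s / len(readings) ** 0.5          # experimental standard deviation of the mean (GUM 4.2.3)
    print(mean, s, sem)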
Show us in the GUM references 4.2, C2.12, C.2.21, or C3.3 where it shows the standard deviation of the mean as the Type A standard uncertainty.
What is more appropriate is for you to show a metrology text that has a different derivation using different assumptions for the value of the standard uncertainty of the mean.
You apparently do not read well. I’m not sure what you think Xᵢ,ₖ actually means mathematically. Here is the text from 4.1.5.
In case you have trouble reading I’ll reiterate. “DETERMINED FROM THE STANDARD DEVIATION ASSOCIATED WITH EACH INPUT ESTIMATE xᵢ, TERMED STANDARD UNCERTAINTY.”
Again, you are failing to prove anything since you never show any references to support your assertions. I do not consider you a metrology expert since you can not provide ANY metrology references. That shows you have never spent any time studying the subject.
“Show us in the GUM references 4.2, C2.12, C.2.21, or C3.3 where it shows the standard deviation of the mean as the Type A standard uncertainty.”
This is just getting ridiculous.
4.2.3 is the part of 4.2 that explains how to calculate the standard deviation of the mean, and that it’s the appropriate value to use for the uncertainty of an input quantity derived from the mean of multiple measurements.
All the other sections you list are just defining what standard deviation means, which in this case is the standard deviation of the mean.
And all of this is still just you distracting from the point about how you account for dependent variables when using the general equations.
“You apparently do not read well”
The usual insult whenever I ask one of the Gormans to explain themselves.
“I’m not sure what you think Xᵢ,ₖ actually means mathematically.”
In this case X is used to indicate input quantities. The subscript i indicates the specific input quantity and the k is either a specific measurement of that input quantity, or a specific set of measurements across all of the Xᵢ.
“DETERMINED FROM THE STANDARD DEVIATION ASSOCIATED WITH EACH INPUT ESTIMATE xᵢ, TERMED STANDARD UNCERTAINTY.”
I love the idea that you think writing things in all caps makes it easier to read. Rather than just making you seem deranged.
Still not sure you understand what it’s saying. Xᵢ is an input quantity. It has an associated standard uncertainty. If the estimate for Xᵢ has been obtained from the mean of multiple measurements, that uncertainty will be the uncertainty of the mean, which they call the experimental standard deviation of the mean.
Did you look at example H.2. to see the actual demonstration of doing that?
“Again, you are failing to prove anything since you never show any references to support your assertions.”
No. I keep providing you with references and you fail to understand them. 4.2.3 has been shown to you multiple times. I’ve just given you an actual worked example in H.2.
“I do not consider you a metrology expert…”
Good. Because I’m not. But I don’t need to be any sort of expert to be able to understand some basic statistics, and to be able to read an equation. And I certainly don’t need to be an expert to see through all those claiming some sort of special expertise here.
“That shows you have never spent any time studying the subject.”
It must rankle with you that despite a lifetime studying this subject, it’s so obvious that you don’t understand it.
The issue is measurements of temperature. You trying to bring up covariance simply illustrates that you do not understand the making of measurements.
They both use the word correlation but their meaning is totally different. You would know this if you had spent any time studying metrology and forecasting.
This illustrates exactly your problem discussing measurements. Physical measurements and how to make them are the priority, not statistics/probability. Statistics/probability is only a tool to characterize a measurement. It is not the reason to take measurements.
In metrology, independence means the lack of connection between two or more variables, period, end of story. It doesn’t mean they can’t correlate, especially in time.
The old analogy applies here. CO2 correlates with postal rates and numerous other things. No one argues that they are not independent. Yet you say, “independence implies zero correlation”.
“The issue is measurements of temperature.”
The issue was whether subtracting two values with a positive correlation reduces uncertainty. Covariance is the reason why.
“They both use the word correlation…”
Because they are both correlations.
“…but their meaning is totally different.”
Their meanings are exactly the same. Auto-correlation is just correlation with the same thing at a different point in time.
“You would know this if you had spent any time studying metrology and forecasting.”
You would know this if you were capable of understanding what you study.
“Physical measurements and how to make them are the priority, not statistics/probability.”
And there’s your usual whine. You just don’t like the conclusion, so you say it doesn’t work in the real world. So what? Do you just want to ignore the effects of correlation? Or do you just want to make up whatever result is most convenient for your argument?
“In metrology, independence means the lack of connection between two or more variables, period, end of story. It doesn’t mean they can’t correlate, especially in time.”
From the GUM:
Do you have a reference in the GUM for your interpretation of independence?
The fact that temperatures are auto-correlated in time DOES NOT show correlation between different measurements. Go take a physical science course at the senior level in college.
If I took a measurement at a location on a hot day, then took another measurement the next day which was also hot, would you say I have two independent measurements? How much certainty would you have that they represented the average monthly temperature?
The UNCERTAINTY INTERVAL is typically given as +/- the standard deviation value!
Your assertion implies that all 2″x4″x8′ boards are always too long. I.e. 8′ + SD!
More idiocy.
Unfreaking believable. You speak about Type A and Type B uncertainty and then try to force everything into being a Type A uncertainty!
Just like you try to force everything into being “random, Gaussian, and cancels”.
More trying to force everything into “random, Gaussian, and cancels”. The mean of an asymmetric uncertainty interval is *NOT* the center of the uncertainty interval! It is just the “balance point”. It is *NOT* the best estimate of the value of the measurement when you have an asymmetric measurement uncertainty.
Once again, your meme of all measurement uncertainty being “random, Gaussian, and cancels” just comes shining through!
Auto-correlation is nothing more than a specific type of correlation. It is a measure of correlation of a variable with itself over time instead of the correlation of two variables.
The cause of the auto-correlation of temperature is SEASONALITY. In essence the two variables involved at a single location are temperature and seasonality. However, they are the same for two different locations as well. Denver and Kansas City temperatures are seasonally correlated. It is the main reason climate science thinks it can say that the temperatures at locations separated by hundreds of km can be considered to be correlated.
GUM 5.2 provides no distinction of the cause of the correlation between two variables.
Do you even understand that you just said that the use of infilling and homogenization of temperatures between distant locations is not valid?
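The seasonality point above is easy to illustrate with synthetic data: two invented stations that share only an annual cycle (statistics.correlation needs Python 3.10+):

    import math, random, statistics

    random.seed(3)
    season = [10 * math.sin(2 * math.pi * d / 365) for d in range(365)]   # shared annual cycle

    station_a = [12 + s + random.gauss(0, 2) for s in season]   # independent local noise
    station_b = [ 8 + s + random.gauss(0, 2) for s in season]

    print(statistics.correlation(station_a, station_b))          # high, driven by the season
    da = [a - s for a, s in zip(station_a, season)]
    db = [b - s for b, s in zip(station_b, season)]
    print(statistics.correlation(da, db))                        # near zero once the cycle is removed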
So funny. Tim’s obsession with me results in him venting all his usual spleen at a comment, seemingly oblivious to the fact it was written by his brother.
I think if the GUM doesn’t mention the term and show how it is derived, then it doesn’t acknowledge that the term exists. If you think another GUM definition is a synonym for “random uncertainty” then show what the GUM calls it and how it is defined.
You need to read the GUM sections on errors again until you understand them. Random and systematic errors are calculated after you know the total error and can then subtract one of the categories from the total. In other words, systematic error is determined by “total error – random error = systematic error”. That is why you must know the true value first in order to know what the total error is and what the random error is.
Why do you think the GUM supersedes the old “error” paradigm? You can’t know a true value so you are screwed from the start.
You keep using the error paradigm and trying to jam it into the uncertainty paradigm. It just won’t work. It is why you can never show any references to support your assertions. Knowledgeable people have moved on from the error paradigm and no longer try to do what you are hoping to accomplish. Tell yourself, never use the word error again.
I don’t know what you are reading, but go back and reread it for understanding.
I really don’t know how you reach the conclusion that this doesn’t require measuring the same thing.
Let’s use the rules of logic here.
Let
“p” = you take independent observations using repeatable conditions
“q” = you have made the best estimate
Using logic:
Conditional -> If p, then q T/F
Contrapositive -> If ~q, then ~p T/F
Logic dictates that both must be true or both false.
If you declare the Conditional statement false, then you have contradicted the GUM. Good luck with that.
“I don’t know what you are reading”
I’m talking about GUM B.2.15, as should be clear from the comment. You pretend it was referring to 4.2.1. Very dishonest.
As I said B.2.15 is just the definition of repeatability.
“I really don’t know how you reach the conclusion that this doesn’t require measuring the same thing.”
4.2.1 says that in “most” cases, if you have n observations under repeatability conditions of something varying randomly, then the best estimate of the expected value of that thing is the mean of the observations.
“I really don’t know how you reach the conclusion that this doesn’t require measuring the same thing.”
I’m really not sure what point you are arguing at this stage. I think this started as me asking you if you considered a population mean as measurand. Regardless, 4.2.1 does not say repeatability is required in order to have a measurement. What it says is that is the best estimate if you have repeated measurements.
Section 4.2 mentions other ways of obtaining type A uncertainties, and finishes
In any event, none of this explains whether you think a mean of a sample can be considered a measurement, and if it can’t why you think the GUM is the best place to understand sampling evaluation.
“Logic dictates that both must be true or both false.”
The truth here being that you have the best estimate. But in the real world you often can’t get the best. Then again, in some cases you might get a better estimate by using a more accurate instrument rather than taking a large number of measurements with a poor quality instrument.
Because repeated measurements of the same thing can define the values that encompass the dispersion of measurements around the mean.
Dr. Taylor’s derivation of Standard Deviation of the Mean (SDOM) defines the requirements needed to infer the dispersion of value for a single quantity. Read it until you understand the requirements.
That is not true. See GUM 4.2.1. It defines what the average of multiple observations of an input quantity is. However, it does require the same conditions of measurement. See GUM B.2.15. Part of this is measuring the SAME MEASURAND.
GUM F.1.1.2 tells you how to treat uncertainty when dealing with changing “properties” of a measurand. What do you think the uncertainty Tavg is using this definition? How about Tmonth_avg?
“Because repeated measurements of the same thing can define the values that encompass the dispersion of measurements around the mean.”
There you go again. What mean are you talking about? The mean of a population of different things, or the true value of a single thing?
“Dr. Taylor’s derivation of Standard Deviation of the Mean (SDOM) defines the requirements needed to infer the dispersion of value for a single quantity.”
Because he’s demonstrating the SEM in the context of improving the measurement of a single value. You seem to have this weird idea that Taylor invented the notion of a SEM. And your so called “requirements” are only related to the specific way he’s proving the equation.
“That is not true.”
So are you now saying that a population mean is a measurement?
“GUM F.1.1.2 tells you how to treat uncertainty when dealing with changing “properties” of a measurand.”
No it doesn’t. It’s explaining that you have to be careful about what is meant by independent observations. If you measure the same thing repeatedly, you can say each measurement is independent with regard to that one thing you are measuring, but not independent with regard to that one thing as representing all things of the same category.
E.g. if you measure the same piece of gold 100 times to determine its density, those 100 observations may be independent measurements of that piece of gold, but you do not have 100 independent measurements of the density of gold, because they are all dependent on that one specimen.
It would be the same regarding temperatures. If you take 365 daily measurements from the same station, under identical conditions, you have a large sample of measurements for that one station under those conditions, but you do not have a large sample of global temperatures.
“What do you think the uncertainty Tavg is using this definition?”
I’m not sure what your question is. Do you mean the uncertainty of Tavg? How you would estimate the uncertainty would depend on a lot of things, including the measurement uncertainty. But it depends on what you want the temperature to represent. Is it just what the temperature was as that station on a single day, or do you want to treat it as representative of the whole month or region?
Multiple people have tried to explain this to you in the past. You just let it go in one ear and out the other just like you always do.
The mean is a statistical descriptor of a set of data. The mean is *NOT* itself a piece of data, it is a descriptor of pieces of data.
The mean is *NOT* a measurement. It is *NOT* data. Its only use is in understanding the data. The mean can be used to draw conclusions about the data but those conclusions are subject to the same uncertainties as the data itself. It’s why you can’t reduce the uncertainty of a set of data by more precisely locating its “average”. That “average” value isn’t a more accurate measurement, it’s just a more precise descriptor. The same uncertainty still carries on to that descriptor from the actual data itself no matter how many digits you use in calculating the average.
Only in statistical world do statistical descriptors become actual measurement data instead of tools to understand the data itself.
It’s why *all* of the applicable statistical descriptors should be provided at each step of analysis of data. It’s why the variance *and* average should always be provided in climate science but never is. It’s why weighting based on variances should be used when calculating averages in climate science but never is. It’s why measurement uncertainty should *never* be assumed to cancel in climate science but always is.
YOU *do* live in statistical world. It’s self-evident in everything you assert.
https://www.wmbriggs.com/post/17158/
Excellent!
He has excellent treatises on climate science foibles.
It’s why statisticians like bellman will NEVER understand uncertainty. They believe the mean is the EXPECTED value, ALWAYS. Nothing but the average means anything. The long shot horse will never win. Craps will never happen. The average of an asymmetric distribution is still the EXPECTED value, not the mode.
When the GUM says you can never know the true value, statisticians just say that you haven’t averaged enough values.
“It’s why statisticians like bellman”
Stop claiming I’m a statistician. It’s flattering, but this is purely a hobby for me.
“They believe the mean is the EXPECTED value”
The mean is the expected value, so to speak. That’s what expected value means: the mean of a distribution. We’ve been over this before, and you still seem to not get that the term “expected value”, even if you write it in all caps, does not mean the value you always expect to get.
“The long shot horse will never win.”
Only someone who doesn’t understand probability would say that. A long shot by definition has a non-zero probability therefore will sometimes win. Just not as often as a dead-cert.
“Craps will never happen.”
Nobody who reads any of your comments would believe that.
“The average of an asymmetric distribution is still the EXPECTED value, not the mode.”
Again, yes, that’s what expected value means. The mode value will happen most often, but the expected payoff depends on the expected value.
“When the GUM says you can never know the true value, statisticians just say that you haven’t averaged enough values.”
Quote one statistician who says that. Unless by enough values you mean an infinite number of values, and unless you are assuming there is zero systematic error, then you will never know the true value. And even then true value is never going to be a really true value, given the uncertainty of the universe.
“The mean is the expected value, so to speak”
This is *NOT* a general truth. In any asymmetric distribution the mean is *not* the expected value, the mode is. Once again we see the meme of “random, Gaussian, and cancels” showing up in your assertions. You don’t *have* to explicitly state that you are using that meme, it is self-evident in every assertion you make!
That is *NOT* what the mean is. The general definition of the mean is the balance point of a distribution. It is only the expected value in certain cases. Your assertion here is one big clue that you assume everything is Gaussian – including measurement uncertainty!
ROFL!!!! Did you actually read this before you hit the post button?
“Again, yes, that’s what expected value means. The mode value will happen most often, but the expected payoff depends on the expected value.”
Total and utter bullshite! So if the mode is 10 and the mean is 5 you would put your money on the 5 showing up in the next draw from the population? Wow! The casino’s must love seeing you come in the door!
Then how can you assume you know the systematic uncertainty of each and every measurement? If you can’t know the true value then you can’t know the components making up that true value either!
“This is *NOT* a general truth. In any asymmetric distribution the mean is *not* the expected value, the mode is.”
This is just getting pathetic. I keep explaining what expected value means in the context of a distribution. I’ve given you at least three references in this comment section alone. Yet all you do is keep coming back with “na, not so!”. If you think expected value means the mode of a distribution, you need to provide evidence for that.
“Once again we see the meme of “random, Gaussian, and cancels” showing up in your assertions.”
And lying about me is not evidence. Actually I suppose it’s evidence that you have no argument.
“It is only the expected value in certain cases.”
Evidence?
“ROFL!!!! Did you actually read this before you hit the post button?”
Yes, because unlike you, I actually know what expected value means. You’ve just guessed. Have you considered asking your AI friend to explain it to you?
“Total and utter bullshite! So if the mode is 10 and the mean is 5 you would put your money on the 5 showing up in the next draw from the population? Wow! The casino’s must love seeing you come in the door!”
Please, at least pretend you are trying to understand. The expected value is what you expect to win on average. Say I play a game where I pay $1 to play; two dice are rolled; if they add to 7 I get my stake back plus $4, any other total and I get nothing. Win and I’m up $4, lose and I’m down $1. Is that a good bet? Using the expected value, and assuming fair dice, the probability of rolling a 7 is 1/6. The expected value is the mean of all possible outcomes, weighted by their probabilities: (5/6)(-1) + (1/6)(+4) = -1/6. On average I lose 1/6 of a dollar on each bet, or the expected value of a bet is -1/6 of a dollar. I conclude this is not a good bet.
If I thought the mode was the expected value, then that’s 7, so I would conclude it’s a good bet as I’m always expecting to win.
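The bet described above, worked exactly and by simulation under the stated assumptions:

    import random
    from fractions import Fraction

    # Exact expected value: net +4 with probability 6/36 (a total of 7), net -1 otherwise
    p7 = Fraction(6, 36)
    print(p7 * 4 + (1 - p7) * (-1))      # -1/6 of a dollar per play

    # And by simulation
    random.seed(7)
    plays = 100_000
    net = sum(4 if random.randint(1, 6) + random.randint(1, 6) == 7 else -1 for _ in range(plays))
    print(net / plays)                   # close to -0.1667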
“Then how can you assume you know the systematic uncertainty of each and every measurement?”
You don’t. You might try to estimate it, but there will still be an uncertainty in that estimate.
“This is just getting pathetic. I keep explaining what expected value means in the context of a distribution.”
from Gemini: “Expected value is a measure of central tendency; a value for which the results will tend to”
The central tendency in an asymmetric distribution is the MODE. You can try and fight this all you want. It’s just a basic truth. The values in an asymmetric distribution do *NOT* tend to the mean. If the mean happens to equal the mode then so be it – but that is *NOT* the general case.
Really! The central tendency in a distribution is toward the most frequent value. That is the MODE for the general case, not the mean. The mean only becomes the central tendency in certain cases!
Now you are just making up crap! You’ve had to throw in a confounding variable as a red herring.
“The expected value is the mean of all possible outcomes”
And you somehow think this is describing an ASYMMETRIC distribution?
Then how can you assume they are equal?
Let me add that in metrology, the term “uncertainty interval” refers to a range of values within which the true value of a measured quantity is believed to lie, with a stated level of confidence.
The confidence level tells one the probability that the interval contains the true value, often 95%.
Saying the mean defines what one expects to win illustrates that one doesn’t understand what measurement uncertainty is all about. That is saying it is a game of probability. It is not.
It is a tool to DEFINE a central value and an interval where the true value may exist. That is why intervals are becoming the accepted method of quoting an uncertainty interval. By using a central value, the uninitiated assume it is the true value, and it truly is not that.
“Let me add that in metrology, the term “uncertainty interval” refers to a range of values within which the true value of a measured quantity is believed to lie, with a stated level of confidence.”
Except that’s exactly what the GUM says not to do. They say you should not think of an expanded interval as representing a specific level of confidence.
“That is saying it is a game of probability. It is not.”
You’ve literally just described the confidence interval in terms of probability. The expected coverage of a 95% confidence interval is 0.95. That is, you will have the true value within the confidence interval 95% of the time.
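As a rough illustration of that coverage claim, here is a small Monte Carlo sketch in Python; it assumes normally distributed measurement error with a known sigma, which is a simplification I am adding, not something stated in the comment.

    # Simulate many samples and count how often the 95% interval for the mean
    # contains the true value.
    import random

    true_value, sigma, n, trials = 10.0, 1.0, 20, 10_000
    z95 = 1.96
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(true_value, sigma) for _ in range(n)]
        mean = sum(sample) / n
        half_width = z95 * sigma / n ** 0.5   # known-sigma interval for simplicity
        if mean - half_width <= true_value <= mean + half_width:
            hits += 1
    print(hits / trials)   # close to 0.95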
I keep saying that if you want to use a different definition of expected value to apply to measurements, then point to a reference. But all this started because Tim insists that statisticians think that expected value means the value you will always get. And I’m just pointing out that this is not what statisticians mean by expected value.
“That is why intervals are becoming the accepted method of quoting an uncertainty interval.”
Still waiting for your reference to that. I’m not sure why you keep avoiding this – you usually love to provide lengthy quotes.
You just love throwing manure against the wall to see if it sticks.
When you cherry pick, you can’t even do it correctly. That is not what the GUM says. In fact, your assertion is exactly the opposite of what the GUM says. When are you going to tire of being incorrect about metrology?
Read 6.3.2 more carefully. It says that confidence interval has specific meaning in statistics and is only applicable to metrology if ALL uncertainties have been evaluated using Type A procedure.
Let’s see what the GUM actually says about “level of confidence”.
You need to lose the attitude that measurements are done so that statisticians can show how statistical analysis can provide numbers that are meaningful. Statistics are only a tool to create internationally DEFINED descriptors of a set of observations. The statistics are not the be-all and end-all.
The GUM goes to great lengths to educate one about Type B uncertainties. Type B evaluations are equal in quality to frequentist Type A evaluations. And, as I have told you, a simple arithmetic mean/mode and the entire range of observations can also be an appropriate statement of uncertainty.
“… manure … cherry pick … exactly opposite … incorrect about metrology.”
Yes, looking back I might have misremembered what the GUM says. I was picking up on the sentence
But they just replace “confidence level” with the term “level of confidence”.
They do say however that it’s difficult to establish an exact value.
6.2.3 says
And you missed the second part of 6.3.2
I’m not sure what the rest of your rant is about. You’ve just confirmed I was right about this being a “game of probabilities”. Contrary to what I thought, they do say you should quote a probability associated with an expanded uncertainty.
“from Gemini”
What, gone off Copilot? Or didn’t it give the answer you wanted?
“Expected value is a measure of central tendency; a value for which the results will tend to”
Funny, here’s what Gemini told me in response to the question “What does expected value mean?”
and goes on to say
What prompt did you use to get such an uninformative response?
I also asked Gemini “Is expected value the same thing as central tendency?” and it said
“The values in an asymmetric distribution do *NOT* tend to the mean.”
The mean of the values does. I’m not sure what you mean by “the values tend to”. The more values you take, the more the distribution of values will tend to the population distribution. That means the mean of the sample will tend to the population mean and the mode of the sample will tend to the population mode.
“Really!”
Asserting something is not evidence.
“Now you are just making up crap!”
Please stop crediting me. This “crap” goes back centuries.
https://en.wikipedia.org/wiki/Expected_value
You are doing nothing more than saying the “average is the average”!
Circular logic.
“The mean of the values do. I’m not sure what you mean by the values tend to”
The MODE is the value that a distribution tends toward. It *has* to be. Otherwise it wouldn’t be the most frequent value in the distribution.
You keep trying to say that a distribution does *NOT* tend to the mode, that it always tends to the average! The AVERAGE tends to the average, a circular logic chop! That doesn’t mean the distribution always tends to the average – it DOESN’T!
A distribution only tends to the average if it equals the mode, e.g. a Gaussian distribution.
This is *STILL* nothing more than your meme that everything is random, Gaussian, and cancels.
“You are doing nothing more than saying the “average is the average”!”
I’m saying the expected value is defined as the weighted mean of a distribution. That is not circular logic, it’s simply explaining the definition.
“The MODE is the value that a distribution tends toward.”
Please try to explain what you mean by any of this. How does a distribution tend to a single value? Are you talking about a sample tending towards something as sample size increases?
“You keep trying to say that a distribution does *NOT* tend to the mode”
I’m saying it’s a meaningless statement. I’m sure it means something to you in Gormanland, but unless you define your terms it means nothing in the real world. As sample size increases, a sample distribution will tend to the distribution of the population. Its mode will tend to the mode of the population, its mean will tend to the mean of the population, etc. Saying the distribution will tend to the mode is meaningless.
“This is *STILL* nothing more than your meme that everything is random, Gaussian, and cancels.”
Please seek medical help.
“Expected value is a specific type of central tendency”
Get into a conversation with Gemini over what the central tendency of an asymmetric distribution actually is.
Do you actually understand what this is saying? It’s circular logic! It’s saying the average is the average because we defined the average as the Expected Value.
“Get into a conversation with Gemini over what the central tendency of an asymmetric distribution actually is.”
You can’t have a conversation with an AI, they are not real – they don’t understand what they are saying. Any meaning is in your own head.
And the question is about the meaning of expected value, not about central tendency.
“Do you actually understand what this is saying?”
Yes, because, unlike you, I know what expected value means.
“It’s circular logic!”
It’s not logic, it’s just that they mean the same thing. Expected value means the mean of a distribution. It’s just another way of saying the same thing.
“It’s saying the average is the average because we defined the average as the Expected Value.”
Just hold that thought in your head, and reflect on it, and maybe you’ll begin to understand how much of an idiot you look when you keep denying it.
“And the question is about the meaning of expected value, not about central tendency.”
The central tendency *IS* the expected value. You are still using the circular logic of “the average is the average”. Duh!
“The central tendency *IS* the expected value.”
Just stop pretending you are willing to learn. The expected value is the mean, and the mean is a type of central tendency. I’ve given you numerous references to it. You just repeat your mistake, refusing to accept there’s any possibility of your being wrong.
You’ll ignore this as much as the other sources, but Bevington has the same definition for expectation value.
Still waiting for a single reference that supports your case.
“The mean is *NOT* a measurement.”
Finally, an answer.
But why go on to talk about means not being data? It’s not relevant, and makes no sense. Of course a mean is a piece of data. Data is information. Any statistic is a piece of information.
Once again, you seem to only accept information if you can put it in your pocket.
“Only in statistical world do statistical descriptors become actual measurement data instead of tools to understand the data itself.”
I’m not the one claiming you can have a measurement uncertainty on a sample mean.
“It’s why weighting based on variances should be used when calculating averages in climate science but never is.”
Why would you use weighting based on variance? That only makes sense when you are measuring things with the same value.
“It’s why measurement uncertainty should *never* be assumed to cancel in climate science but always is.”
Meaningless. In what way does the mean not being data result in measurement uncertainty not being independent and random?
“YOU *do* live in statistical world. It’s self-evident in everything you assert.”
You mean all the times I’ve asserted that we live in a statistical world. Yes. Statistics is one of the best methods for understanding the world. That’s true even if you personally don’t understand the statistics.
A mean is a STATISTICAL DESCRIPTOR! It is *NOT* a piece of data. It’s why it is called the “best estimate” and not the “true value”. Did you not bother to read the link Jim sent you discussing this?
Information and DATA are two different things. I can perfectly accept that statistical descriptors can provide me information about the distribution associated with a set of data. But that does *NOT* mean that the information is “data”!
You have *never* bothered to grasp the concept of uncertainty or of the Great Unknown. The mean of a distribution simply cannot tell me what the next value *will* be – it only provides a clue as to what the next value is *most likely* to be in a perfectly symmetric distribution. But hardly anything in the real world is a perfectly symmetric distribution, not even coin flips or dice throws. Fortune tellers simply don’t exist in the real world.
The very fact that the means of multiple samples form a distribution tells you that the mean of any single sample cannot be assumed to be 100% accurate when it comes to the population mean. That, in turn, implies a measurement uncertainty for the mean of a single sample.
The real world doesn’t follow the same rules that you go by in your statistical world.
More bullshite! Variance is a metric for uncertainty. The larger the variance, the lower the hump surrounding the average value, meaning values close to the average are nearly as likely to be the true average as the calculated average itself.
If the measurement uncertainties in two measurements are unequal then you simply cannot just average the two values. Doing so gives each measurement equal importance where the more accurate measurement should be given more weight.
Since variance is a metric for measurement uncertainty, the values used in calculating an average should be weighted based on their variance in order to give the most accurate measurements (smaller variance) more weight than less accurate measurements (larger variance).
You simply refuse to learn the basic concepts of metrology yet you have no problem contradicting those concepts as if you know better than all the experts.
It’s the same for climate science. Just jam everything together since all measurement uncertainty is random, Gaussian, and cancels. Therefore it can be ignored. Who cares about variance anyway?
“Information and DATA are two different things.”
I disagree, but it’s going to become another meaningless never-ending argument over the correct meaning of an ambiguous word. To help you out here’s a page that makes the distinction
…
https://www.geeksforgeeks.org/computer-organization-architecture/difference-between-information-and-data/
Lots of other sources make similar distinctions. I think these are really talking about a specific type of data, raw data, rather than using the word in its literal meaning. But let’s not get bogged down in semantics again.
“More bullshite! Variance is a metric for uncertainty. The larger the variance, the lower the hump surrounding the average value, meaning values close to the average are nearly as likely to be the true average as the calculated average itself.”
You are going to have to explain exactly what you want to do with regard to the mean global temperature, and what variance you are talking about.
“If the measurement uncertainties in two measurements are unequal then you simply cannot just average the two values.”
If you are talking about measurement uncertainty, I’ll repeat that that only makes sense when looking at two measurements of the same thing. In that case you can assume that each measurement, treated as a random variable has the same mean but different variances, and you want to give more weight to the more accurate measurement.
But if you are measuring two different things with different values, it makes absolutely no sense to weight the average to the value of the more accurate measurement.
Say you measure two of your wooden boards. One has a length of 100cm with a standard uncertainty of 1cm, the other is 200cm with an uncertainty of 2cm. So using inverse-variance weighting you have
(100/1² + 200/2²) / (1/1² + 1/2²) = 120cm
You’ve changed the average from 150 to 120cm just because one measurement was a bit more accurate. If you go by intervals, the smallest mean would be (98 + 196) / 2 = 147cm, and the largest would be 153cm. There’s no way the mean could be as small as 120cm, and the best estimate of the mean is still 150cm. The different uncertainties affect the measurement uncertainty of the mean, but not the actual best estimate.
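For reference, here is the same two-board example worked through in a short Python sketch; the ~1.12 cm figure for the simple mean is my own addition and assumes the two uncertainties are independent.

    # Two boards: 100 cm +/- 1 cm and 200 cm +/- 2 cm (standard uncertainties).
    values = [100.0, 200.0]
    u = [1.0, 2.0]

    weights = [1 / ui**2 for ui in u]                # inverse-variance weights
    weighted_mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    simple_mean = sum(values) / len(values)
    u_simple_mean = (sum(ui**2 for ui in u) ** 0.5) / len(values)   # quadrature, then /2

    print(weighted_mean)                 # 120.0 -- pulled toward the more precise board
    print(simple_mean, u_simple_mean)    # 150.0 and ~1.12 cm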
The measurements do *NOT* have to be of the same thing. Why do you think that the measurement uncertainty of the radius of a barrel is added in twice while the height is only added in once? They are *NOT* measurements of the same thing. Yet the measurement uncertainty of the radius gets a much greater weight.
Of *course* it makes sense. At least to anyone living in the real world. If you measure Journal1 with a micrometer that has an uncertainty of +/- .001″ and Journal2 with a micrometer with an uncertainty of +/- .0001″, tell us exactly how *YOU* would do the average. Would you just add the two measurements and divide by 2? Or would you give the measurement from Journal2 more weight?
I can guess what your answer will be based on what you are trying to assert here!
Then why do we ever try to get better measurement devices to use in critical situations? Just average the less accurate measurements together and you’ll get a “best estimate” that is equivalent to the best estimate from a more accurate device!
“Why do you think that the measurement uncertainty of the radius of a barrel is added in twice while the height is only added in once?”
Because the radius is squared. I’ve explained this to you enough times – it’s all to do with partial derivatives. Absolutely nothing to do with weighting by variance.
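To make the partial-derivative point concrete, here is a small sketch for a cylinder V = πr²h; the numbers are purely illustrative and the propagation shown is the standard first-order formula for independent uncertainties.

    # dV/dr = 2*pi*r*h and dV/dh = pi*r^2, so the radius term carries a factor
    # of 2 in the relative form: (u_V/V)^2 = (2*u_r/r)^2 + (u_h/h)^2.
    import math

    r, u_r = 0.30, 0.002   # metres (illustrative values)
    h, u_h = 0.90, 0.002

    V = math.pi * r**2 * h
    u_V = math.sqrt((2 * math.pi * r * h * u_r)**2 + (math.pi * r**2 * u_h)**2)
    print(V, u_V, u_V / V)   # the radius contributes twice as strongly, relatively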
“Of *course* it makes sense. At least to anyone living in the real world.”
Ah, that real world where averages can change substantially just because one value is measured more accurately.
“If you measure Journal1 with a micrometer that has an uncertainty of +/- .001″ and Journal2 with a micrometer with an uncertainty of +/- .0001″, tell us exactly how *YOU* would do the average.”
(Journal1 + Journal2) / 2 would be the average. It might depend on why you want to know the average of two different sized journals.
“Then why do we ever try to get better measurement devices to use in critical situations?”
Because you want better measurements. Is this a trick question?
“Just average the less accurate measurements together and you’ll get a “best estimate” that is equivalent to the best estimate from a more accurate device!”
See Bevington for why that’s not a good idea.
tpg: “Then why do we ever try to get better measurement devices to use in critical situations?”
bellman: “Because you want better measurements.”
You just said to average the less accurate measurement with the most accurate measurement directly in order to find the average. How does *that* comport with getting better measurements?
You wind up discounting the better measurement when you average them directly! How does that give you a better measurement?
You can’t even be consistent within a couple of sentences!
“You just said to average the less accurate measurement with the most accurate measurement directly in order to find the average.”
Please try to understand the point. I said that if you averaged different things you could not use a variance weighted mean. I tried to explain why that was the case. You’ve had plenty of time to try to understand the point, and if you disagree, present your counter argument. Instead you just claim I’m saying something different and use it as yet another straw man.
If you measure the same thing (or different things expected to have the same value), it makes sense to use inverse-variance weighting. That’s because you expect all the measurements to be random variables with the same mean – that of the true value of the thing you are measuring. The weighting reflects the fact that the instruments with the greatest uncertainty are likely to have the bigger error – the widest uncertainty interval. Thus it makes sense to weight the mean towards the results with the least uncertainty.
But if you do the same thing when you are looking for the mean of measurements of different values, you are shifting that mean towards the values of the measurements with the greatest certainty. And that creates a bias in your average. If the thing you measure with a precise instrument happens to be small, you get an average that is too small.
“Please try to understand the point. I said that if you averaged different things you could not use a variance weighted mean. I tried to explain why that was the case. You’ve had plenty of time to try to understand the point, and if you disagree, present your counter argument. “
My argument is that if you have measurements with differing variances you *do* have different measurement devices and protocols. When averaging you *must* give more weight to the most accurate. If you don’t then you are giving equal weight to the less accurate measurements. That makes no sense in reality at all.
You didn’t explain at all *why* you can’t weight temperature measurements based on their measurement uncertainty. You just stated that you can’t – and then turned around and said *I* have to justify why you should!
Taylor covers this in detail in Chapter 7 of his book. If what is being measured is the “global temperature anomaly” (i.e. the “same thing”), then the average of those measurements must be done by weighting each based on its variance. The variance of cold temperatures *is* different than for warm temperatures and those variances do carry over into the anomalies calculated from the actual temperatures.
In essence, the probability of the average winds up being based on the sum of the squares of the deviations of each measurement from the true value, each divided by its corresponding uncertainty.
This actually even applies to the calculation of the mid-range daily temperature. Since the daytime temperature variance is different from the nighttime temperature variance, the average of the daytime and nighttime temperatures should be done using weighting. It’s only when the variances are equal that you get the familiar (x+y)/2 for the mid-range value.
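A minimal sketch of the weighted average being described (Taylor-style inverse-variance weighting); the day/night values and uncertainties below are made up purely to show that the formula collapses to (x + y)/2 when the two uncertainties are equal.

    def weighted_mean(x, u_x, y, u_y):
        wx, wy = 1 / u_x**2, 1 / u_y**2
        return (wx * x + wy * y) / (wx + wy)

    t_day, u_day = 25.0, 0.5       # hypothetical daytime value and uncertainty
    t_night, u_night = 10.0, 1.0   # hypothetical nighttime value and uncertainty

    print(weighted_mean(t_day, u_day, t_night, u_night))   # 22.0, pulled toward the day value
    print(weighted_mean(t_day, 0.5, t_night, 0.5))         # 17.5, i.e. (x + y)/2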
Do you understand that your assertion implies that you actually cannot average the measurements of different things and that carries over into measuring different temperatures at different locations at different times? What you’ve actually done is say that the global average temperature can’t be calculated because it’s based on the measurements of different things. It would be like averaging the length of two different kitchen tables to find the “average length of kitchen tables”! You don’t seem to be capable of even understanding the implications of your own assertions.
Your epic rant is pointless unless you address the point I was making. Your claim was that me thinking it’s unwise to use variance weighting in the mean of different things implied I believed that there was no point in using better instruments when you could just take an average of a large number of imprecise instruments.
Unless you can demonstrate why this isn’t a massive non sequitur, then everything else you write here is just a distraction.
To help you, here are three reasons why your claim doesn’t follow from your premise.
“See Bevington for why that’s not a good idea.”
It’s what *YOU* suggested doing!
bellman: “But if you are measuring two different things with different values, it makes absolutely no sense to weight the average to the value of the more accurate measurement.”
“It’s what *YOU* suggested doing!”
You just can’t help lying. If you didn’t have any strawman arguments you’d have no arguments at all.
“bellman: “But if you are measuring two different things with different values, it makes absolutely no sense to weight the average to the value of the more accurate measurement.””
I’m not sure if you are claiming that what I said is the same as
If that’s what you are saying, you are beyond hope. You are just seeing the words I write and fabricating them into whatever fantasy you feel like.
You can’t even admit to what you said, can you? I QUOTED you.
And he still can’t see that what he quoted was completely different to what he claimed. If not trolling, it’s just sad, and if trolling it’s even sadder.
There is no bogging down. A statistical descriptor of a set of data is *NOT* itself a piece of data. The mean of a set of measurement data is *NOT* itself a measurement. It is nothing but a descriptor of the actual measurement data.
it is *truly* just that simple.
“A statistical descriptor of a set of data is *NOT* itself a piece of data.”
Well, I wasn’t sure the first time you said that. But now you’ve repeated the same assertion 100 times with no evidence, but more capital letters, I’m convinced.
The mean of a distribution is not data. It is the mean of a series of observations (data) grouped into a random variable. It is the center point of that distribution if it is symmetrical.
The GUM calls it the estimated value and it is conditioned by also having a variance that defines the possible values a measurement could be. It is why you can never know the true value.
As a statistician, you place knowing the statistical parameters and statistics as the primary goal of making a physical measurement.
Physical scientists, engineers, and metrologists see statistical information as a defined process to evaluate and describe measurements such that anyone in the world can understand the meaning of a measurement. That is, the physical measurement is the primary goal.
There are proponents of using medians and the entire range of observations as the uncertainty. It is useful in defining the absolute limits of a measurement process. This is useful when refining physical constants. You need to expand your measurement horizons.
“The mean of a distribution is not data.”
Same point I made to Tim, could you provide your definition of the word data, otherwise this just becomes a meaningless semantic argument. I would say the mean is not raw data, but it is data in the sense that data is simply any information.
“The GUM calls it the estimated value and it is conditioned by also having a variance that defines the possible values a measurement could be.”
You keep saying things like this as if it’s something I disagree with. If estimated values are not data then that means no measurement is data.
“As a statistician”
I’m not, but go on.
“you place knowing the statistical parameters and statistics as the primary goal of making a physical measurement.”
Do I? I’d say that depends on the purpose of the measurements. Generally I would assume that the purpose of the measurement goes beyond merely knowing the measurement, but that isn’t necessarily about finding a statistical parameter.
“Physical scientists, engineers, and metrologists see statistical information as a defined process to evaluate and describe measurements such that anyone in the world can understand the meaning of a measurement.”
As would anyone.
“That is, the physical measurement is the primary goal.”
I’m not sure what point you are making. Is the primary goal to get physical measurements, or is it to provide data that can be used to describe things in a meaningful way?
“There are proponents of using medians and the entire range of observations as the uncertainty.”
The uncertainty of what? And can you get round to supplying a reference to these proponents?
I want to get me one of those instruments that measures the average value of something and save me the time and trouble of taking a lot of readings. 🙂
You reminded me of a process of peaking coils in an interstage point between amplifiers. With an analog voltmeter it is easy with hand eye coordination to find the maximum point. With a digital voltmeter, it takes forever. A scope can work, but sometimes seeing the peak is hard, kinda like with a digital voltmeter.
“measurement inaccuracies are random and uncorrelated rather than systematic”
How do you assume this when measuring different things with different things? How can you assume uncorrelated when they have the same confounding variable involved – the sun?
There is little justification for giving an uncertainty to better than an order of magnitude, i.e. 1 significant figure. To do so simply means that one is displaying not only the first uncertain digit but the next smaller one (1/10th) as well, which is really meaningless. If one feels justified in displaying two uncertain digits, why not 3 or 4?
If one examines resolution uncertainty, you will quickly find out that the marked graduations on an analog device or smallest digit on a digital device is the recorded value of measurement. The “estimated” position between the graduations generally determines the uncertainty. That is why one additional digit normally defines the uncertainty, it is where the estimated (uncertain) value originates.
A digital meter is easier to use as an example. If the last digit is tenths, then the uncertainty is in the hundredths digit. One must assume that the uncertainty is ±0.05.
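A short sketch of that resolution arithmetic; the rectangular-distribution figure at the end is my own addition (it is the usual Type B convention for a digital display, not something stated above).

    # A display reading to tenths can only say the value lies within +/- half
    # of the last displayed digit.
    resolution = 0.1
    half_interval = resolution / 2
    print(half_interval)              # 0.05, as stated above

    # Standard (1-sigma) uncertainty for that rectangular interval:
    print(resolution / 12 ** 0.5)     # ~0.029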
a small nit… that is the uncertainty of the device itself, not of the measurement process (operator, application, environment, etc.)
I agree. Resolution uncertainty is only one item in an uncertainty budget.
That was SOP for surveyors back in the days of optical transits or a Brunton compass, especially for turning a small angle with only one or two digits. However, that was justified because the measurements were performed by the same operator, using the same transit, turning the same angle, with no other variables that might affect the measurement.
There is a little more to it than that. The assumption used for significant figures is that the next digit (1/10X) to the right of the last displayed digit (X) has a range of +/-5, thus meaning that the last digit (X) is uncertain and could be X+/-1.
Whereas, if there is more information, such as that obtained from a rigorous propagation-of-error calculation in quadrature, it may be possible to narrow that band to, say, +/-4 or even +/-1. Yes, it is “rough” in the sense that it is only an order of magnitude, but it is commonly sufficient, only being off by, at most, 80% of the ‘shorthand,’ with the rigorous estimate always being better; that is, the shorthand is an upper bound.
The problem with this is you can end up with phantom resolution. I don’t remember where I read about this but the author described it as leaving gaps in between the resolution values. In other words, the subsequent intervals are not adjacent so there are possible values that cannot occur.
As I get to your humorous ending, a revelation comes upon me and suddenly I understand the eight digits:
Most people are somewhat innumerate, or at least they only work with such numbers as they need, like shopping, where they ignore all the nines after the initial LowLowPrice.
Most people will immediately tell you that 34.5678 is obviously more than 400. I mean, look at all those digits!!!
That is their audience.
“It’s simply not possible to derive measurement accuracy to more decimal places than the originally recorded data.”
Well, say you have two temperature measurements: 60° and 65°F. The average of the two is 62.5°. You see that on NWS F-6 data sheets all the time. It’s simply an average of measurements, not a measure of accuracy. But I agree, having several decimal places for temperature measurements is a waste of ink or electrons.
Even that is not accurate. Both 60 and 65 are measurements, so the result cannot be more accurate than the error band of those measurements. If those numbers were counts, you would be absolutely correct. But they are both approximations.
This is not an appropriate treatment using significant digits. The answer would be 63° ±0.7 using RSS. Or 63 ±1 for maximum uncertainty at 1 σ. Two sigmas (expanded uncertainty) would be 63 ±1.4 or 63 ±2.
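As I read those figures, the ±0.7 is the two ±0.5 half-intervals combined in quadrature and the ±1 is their straight sum; a tiny sketch, on that assumption:

    u1 = u2 = 0.5                       # half-interval of each whole-degree reading
    rss = (u1**2 + u2**2) ** 0.5        # ~0.71, root-sum-square
    linear = u1 + u2                    # 1.0, straight addition
    print(round(rss, 2), linear)
    # (Propagating into the mean rather than the sum would divide these by 2.)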
Here is what I have always operated with. It is a statement from a chemistry class at Washington Univ. in St. Louis.
The intervals of uncertainty are the most important thing to take from this.
“The answer would be 63° ±0.7”
If you are quoting an uncertainty to 1 decimal place you should also quote the best estimate to 1 decimal place.
Show a reference that proclaims this.
What is the half interval of resolution? Why is the uncertainty in either 60 or 65 shown as ±0.5?
Using your assertion, the temperatures should be shown as 60.0 and 65.0, both with an uncertainty of ±0.05 and not ±0.5.
Here is something for you to think about. Using a digital device with only units digits, what is the value of the readings and the average and uncertainty?
“Show a reference that proclaims this.”
Bellman lets me take the easy ones.
https://phys.libretexts.org/Courses/Georgia_State_University/GSU-TM-Introductory_Physics_II_(1112)/01%3A_Introduction_to_Physics_and_Measurements/1.03%3A_Measurements_Uncertainty_and_Significant_Figures
“Using the method of significant figures, the rule is that the last digit written down in a measurement is the first digit with some uncertainty.”
The first digit with uncertainty is based on the resolution and measurement uncertainty of the component measurements. You can’t change this with math – especially with “averages”.
How many times have you insisted on references for this? We’ve discussed it many times and it seems like a pretty obvious rule. Here for example is Taylor (2.9)
And here’s a NIST document
https://www.nist.gov/system/files/documents/2019/05/14/glp-9-rounding-20190506.pdf
I can’t find a specific quote in the GUM, but all their examples of how to express uncertainty make it clear that the result should have the same significant digits as the uncertainty.
The estimated value should be given to the same DIGIT PLACE as the measurement uncertainty. Taylor covers this quite well. Stating a velocity measurement as 953.5 m/hr +/- 10 m/hr is nonsensical. You simply don’t know the estimate past the 10’s digit. It should be given as 950 +/- 10.
If the measurement uncertainty is +/- 10 you can’t decrease that by averaging multiple measurements. Any average should be stated based on the +/- 10 which forces the average to the same level of significant digits as the component values used to create the average.
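A hypothetical helper showing the rounding rule being discussed (round the reported value to the decimal position of a 1-significant-figure uncertainty); the function name and examples are my own illustration, not from Taylor or the GUM.

    import math

    def round_to_uncertainty(value, uncertainty):
        # e.g. uncertainty 10 -> tens place, 0.7 -> tenths place
        place = math.floor(math.log10(abs(uncertainty)))
        return round(value, -place), uncertainty

    print(round_to_uncertainty(953.5, 10))    # (950.0, 10)
    print(round_to_uncertainty(62.5, 0.7))    # (62.5, 0.7)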
“The estimated value should be given to the same DIGIT PLACE as the measurement uncertainty.”
Yes, that’s what I said. The least significant digit of your reported result should be the same decimal place as the least significant digit of your quoted uncertainty.
“Stating a velocity measurement as 953.5 m/hr +/- 10 m/hr is nonsensical.”
Correct, but you bring up an obvious problem. Is the 10 reported to 1 or 2 significant figures? Using Taylor’s rule it could be reported to 2 significant figures, but we have to assume that your uncertainty is 10, and not say 9 rounded to 10.
“It should be given as 950 +/- 10.”
But here you are assuming it’s only quoted to 1 sf. If reported to 2, it should be 954 ± 10.
“If the measurement uncertainty is +/- 10 you can’t decrease that by averaging multiple measurements.”
You can if you understand how it works. As I and others have explained to you at length. Do you remember the couple of exercises where Taylor specifically points this out?
You didn’t read Taylor did you?
The ±10 means the 4 in the value is not significant to the determination.
Here is Dr.Taylor’s rule 2.9
The order of magnitude of the uncertainty is the tens position.
You’ve never studied or needed to use physical measurements have you?
“You didn’t read Taylor did you?”
No. That’s why I never quote him.
“The ±10 means the 4 in the value is not significant to the determination.”
Only if your ±10 is written to 1 significant figure.
“Here is Dr.Taylor’s rule 2.9”
You mean the rule I quoted at the start of this?
On reflection it’s a bit ambiguous. Does he mean the order of magnitude of the most significant digit of the uncertainty or the second digit? This isn’t normally a problem for Taylor as he usually only writes 1 sf for the uncertainty, but what about the cases where the uncertainty starts with a 1 and he writes it to 2sf? Unfortunately I can’t find any examples where Taylor writes an uncertainty to 2 digits, so it’s a bit difficult to say what he had in mind.
On the other hand, the NIST document I quoted is quite clear
https://www.nist.gov/system/files/documents/2019/05/14/glp-9-rounding-20190506.pdf
Or any of their examples, eg.
So stating a reading as 90 ±0.5 is incorrect?
Why don’t you study some science instead of cherry picking.
Using your interpretation it should be written as 90.0 ±0.5.
Yet that means your measuring device has a resolution of 0.1. What if it is a digital device that has units digit resolution? Do you just assume the tenths digit is zero?
“So stating a reading as 90 ±0.5 is incorrect?”
I’d say so, yes. Otherwise, how do I know if you measured the value as 90.4 and rounded down, or you measured it on a device that only reports to the nearest unit? If you follow the GUM’s recommendation and are using a standard uncertainty you should avoid the ± style. You might write 90.0(5) which would mean something different to 90(5).
What does Taylor say? If I remember correctly he says you should assume the uncertainty of a rounded value is ±1, which sidesteps the issue.
“Why don’t you study some science instead of cherry picking.”
Because this has little to do with actual science and more to do with standards and style. And it’s a bit rich to accuse me of cherry picking when I keep going by what the various standards actually say. Whereas you seem to just pick anything that matches how you were taught to do it.
“Yet that means your measuring device has a resolution of 0.1.”
Only if you are expecting everyone to be using some archaic style that is not mentioned in any of the standards you insist I have to agree with. You claim I’m disagreeing with the GUM, yet you keep ignoring the section where they actually give you guidance as to how to express uncertainty.
You don’t know.
This is getting ridiculous. You have no idea what you are talking about.
Just FYI, that is one reason Uncertainty intervals are the preferred manner of stating the dispersion of measurements done on a measurand. The mean (center value) has no better chance of being the true value than any other value in the interval.
Better yet, go find an NWS manual for observers of LIG thermometers. Then ask yourself how uncertainties smaller than ±0.5 were determined prior to 1980.
“You don’t know.”
Yes, that’s my point. It’s better if you write the values in a way that removes that ambiguity.
“Just FYI, that is one reason Uncertainty intervals are the preferred manner of stating the dispersion of measurements done on a measurand.”
Please quote a reference. You keep claiming that the GUM is wrong to use standard uncertainties, and the preferred way is an uncertainty interval, but you never say where this preference is documented.
“The mean (center value) has no better chance of being the true value than any other value in the interval.”
That’s just wrong. If you are talking about the mean of the results of multiple measurements of the same thing, the distribution will tend to a Gaussian, which means the likelihood, or probability, will be highest for the mean being the true value.
My high school physics teacher, who taught us about significant figures, would give this work an “F”.
That sort of math nonsense is common. Here in MA, the state forestry agency once gave a number for the amount of timber on all the land in the state – based on a federal census taken every so many years, where one acre in 6,000 is “carefully” measured. The figure given by the agency for the timber resource was, I think, to 6 decimal places. When I ranted about that to all the forestry folks in the state, including the state officials, nobody responded.
https://pressbooks.nebraska.edu/chem1014/chapter/experiment-2-measurements-and-significant-figures/
How many physics, chemistry, and engineering lab courses have this prescribed procedure?
Statisticians can’t even tell you why readings and measurements are rounded to the nearest graduation and quoted with an uncertainty that “covers” the uncertain value between graduations. Statisticians have never been required to justify a measurement when using a specific device nor tell a professor why they are CERTAIN what the value of a physical measurement actually is. Why is it one never sees any discussion of an uncertainty budget with detailed categories by warmists?
Yes, there are two kinds of numbers – counts and measurements. Counts are definite: they are how many of something there are. Measurements are not definite – they are approximations. When averaging approximations, the result can be no more precise than the least precise number in the process. Thus, if the measurements were to one degree, it would have to be plus or minus one degree for each number, as they are approximations themselves. When averaged, the result still carries that plus or minus one degree. It is impossible to average a bunch of approximations and get a result more precise than the measurements themselves.
In addition, temperature is an intensive property. That means that creating averages of intensive properties results in meaningless numbers. The takeaway is that anyone claiming to average temperature does not understand what they are doing. The Significant Digit rule comes into play in all measurements.
Temperature measurements reported beyond one or two decimals are meaningless, as nothing in nature has a temperature that is that stable. Even water triple-point baths used in the calibration of thermometers are typically +/- 0.01 C and at best, under extraordinary precision control, +/- 0.002 C. Organizations or so-called scientists who report temperatures of anything beyond a properly determined number of significant digits should have their scientist license revoked.
The very basic technical issue faced by the temperature charters is that when multiple measurements are combined, the measurement uncertainty of the combination always increases above those of the constituents. This is of course totally unacceptable for them because it makes resolution of tiny rates of temperature changes untenable. Instead they invoke hand-waving and tortured statistics to claim that “error” somehow magically cancels.
They aren’t claiming that the original measurements are precise to 8 significant figures. Instead, they are claiming that the precision of an average of a large number of measurements can be increased by dividing by the square root of the number of measurements – even though what is being measured changes over time. Unfortunately, if that is done with the original (raw) data, the trend increases the variance. Even if the time series is de-trended, it only gives a rough estimate (not 8 significant figures) of the random variations, which may or may not cancel, depending on what influences are causing the variations.
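A synthetic sketch of that point about trends and the square root of N; the series below is made up, and it is “perfectly” detrended only because the trend is known in advance, which real data never allows.

    import random, statistics

    random.seed(1)
    n = 600                                      # e.g. 50 years of monthly values
    noise_sd = 0.5
    trend = [0.005 * i for i in range(n)]        # a slow linear drift
    raw = [t + random.gauss(0, noise_sd) for t in trend]

    s_raw = statistics.stdev(raw)
    residuals = [x - t for x, t in zip(raw, trend)]
    s_detrended = statistics.stdev(residuals)

    print(s_raw / n ** 0.5)          # inflated: the trend is counted as 'scatter'
    print(s_detrended / n ** 0.5)    # ~noise_sd / sqrt(n), the usual formula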
The problem is that those doing that misunderstand how to apply the theory. The data used have to be stationary, meaning that the mean and SD of the data set do not change with time. That means something like the diameter of a highly spherical, polished ball bearing in a temperature-controlled environment, measured by the same calibrated instrument whose precision is greater than the variance in the ball bearing.
An air mass or parcel of air does not meet that fundamental requirement because it varies over time with altitude, humidity, and barometric pressure; because of turbulence induced by obstacles and cross winds, the distributions of the fundamental properties of the air mass are heterogeneous. Basically, one never measures the same air mass twice.
The unstated assumption is that the difference between measured air masses is statistically insignificant and that therefore they can computationally treat all air masses as being identical, which is obviously false. It has to be demonstrated, rigorously, that any and ALL changes in the measured property, such as temperature, are random. The fact that the measurements are taken to determine the long-term changes of air temperature is de facto evidence that the time series has multiple trends.
The accepted mean global temperature for 1850-1900 is widely regarded by critics as a “hot mess” — a figure heavily influenced by the limited and uneven network of reporting stations, and fraught with so many uncertainties that it essentially reflects station coverage more than actual global climate conditions.
The number and distribution of temperature stations globally in 1850-1900 were extremely limited, heavily biased to the Northern Hemisphere, and concentrated in certain populated regions. Vast areas of oceans, polar regions, and the Southern Hemisphere had very sparse or no direct measurements.
Measurement inconsistencies: Early temperature measurements were taken using less standardized methods, including manual readings and ship-based sea surface temperatures with varying biases (bucket sampling, engine intake), complicating the ability to unify or calibrate data accurately.
Proxy substitution: Where measurements are lacking, proxies such as tree rings, ice cores, and sediment records are used, but these have much lower spatial and temporal resolution and require assumptions and calibration based on limited overlapping instrumental data.
Statistical constructs: The published 1850-1900 global temperature is a statistical product derived from patchy data and complex interpolation methods. It’s less an empirical measure of real global temperature at that time and more a constructed approximation heavily influenced by available station locations and data processing choices.
Large, underappreciated uncertainty: Critics highlight that the official error ranges (~±0.1 to ±0.2 °C) are unrealistically tight given data scarcity, suggesting true uncertainties likely exceed ±0.3 °C or more, reinforcing that the figure does not robustly represent true global climate.
Limited scientific utility: Because the figure conflates limited observation coverage, measurement method errors, and natural variability, it speaks more to data availability than actual global temperature conditions and should be interpreted cautiously with recognition of its many caveats.
The so-called official mean temperature for 1850-1900 tells more about where and how temperatures were recorded than about the Earth’s true global temperature. The baseline is best understood as a highly uncertain, approximate reference point rather than a precise or reliable depiction of pre-industrial global climate. This complicates scientific conclusions drawn from it and calls for caution in its use for climate comparisons.
Indeed. And it is so obvious. But that doesn’t stop the people forcing the issue by claiming their mathematical equations are right. I don’t think they actually care, as they seem to only be interested in a desired outcome whereby anything goes.
They have a target to aim at. A forced narrative.
It is anti science.
H/T to “The Cause” (referenced in the Climategate e-mails).
once a “scientist” adopts a “cause,” they cease to BE a “scientist.”
“The accepted mean global temperature for 1850-1900 is widely regarded by critics as a “hot mess” — a figure heavily influenced by the limited and uneven network of reporting stations, and fraught with so many uncertainties that it essentially reflects station coverage more than actual global climate conditions.”
I don’t think that is correct.
Despite all the problems you find with the historical temperature record, one thing that can be said is that the written temperature records from around the world are strikingly consistent with each other.
All the written records show the Earth’s climate is cyclical in nature, where, since the end of the Little Ice Age around 1850, the temperatures have warmed for a few decades and then they have cooled for a few decades and the process repeats, to this day.
The Temperature Data Mannipulators took the written records and distorted them (the Hockey Stick global chart) to downplay past warm periods in an effort to make it appear that the temperatures are getting hotter and hotter and hotter (because CO2) and today is the hottest time in human history.
This is all a Big Lie.
The historic, written temperature records do not show this kind of temperature profile. None of the written temperature records resembles a “hotter and hotter and hotter” Hockey Stick chart temperature profile.
So where do the Temperature Data Mannipulators get this Hockey Stick temperature profile if it doesn’t exist in reality?
We know where they got it: They made it all up out of thin air to promote a political agenda for fun and profit.
Here is your evidence. None of these charts, from around the world resemble the bogus, bastardized Hockey Stick chart profile.
https://notrickszone.com/600-non-warming-graphs-1/
We’ve been lied to Big Time, people. We’re still being lied to. The Hockey Stick global chart is science fiction. It doesn’t exist in reality. It is a made-up artifact. It is a BIG LIE that has cost humanity dearly.
“the mean temperature of the world at the moment (early November) is hovering around 14 deg C, which is never used because it does not convey a sufficient element of danger in the global warming message”
Yeah.
When the IPCC says that we must limit “warming” to 1.5°C they’re being deceptive because, contrary to the common English usage of the word “warming,” what they call 1.5 °C of “warming” does not actually mean “getting 1.5 °C warmer.” It means only, very roughly, 0.3 to 0.5°C of actual warming, since the baseline temperature which they use is a guesstimate of average “preindustrial” (late Little Ice Age) temperatures, instead of the current climate optimum.
They do that because if they used a normal English definition of “warming” (i.e., “getting warmer”) they’d have to say that they think more than “0.5°C of warming” (or even less) would be catastrophic, and anyone with any sense would probably laugh out loud.
Of course, since, as you ably demonstrate, we don’t actually know what preindustrial temperatures were, using that as the baseline means that their target is ill-defined. But that’s okay, they’re happy to sacrifice precision for the sake of propaganda, because the real purpose is just to support parasitic climate industries:

The diabolically inadequate probity and provenance standards of temps “data” used in climate “science” should be called out by all serious, dedicated, honest scientists all around the world.
Why is this not happening?
Salaries, grants and all-expenses-paid conferences in exotic places?
Good post.
The same point about the general lack of early land temperature coverage has been made many times before by many others. And ocean SST is worse, as the 2008 Climategate emails proved – and the ‘climate scientists’ at East Anglia well knew.
The same call-out of alarmist measurement PseudoPrecision (not only concerning temperature) has also been made before. The present extreme is NASA’s claimed satalt SLR precision of 0.1mm/year when their own stated intrinsic satalt accuracy is between 3 and 4cm depending on satellite. Covered here long ago concerning Jason 3 (at best 3.8cm) and then Sentinel 6 (about 3.6cm). About three orders of magnitude of false PseudoPrecision.
Makes no difference to alarmist true believers. Hasn’t for decades. Never will.
See my post to Nick S. below. For some reason I missed that you brought up ocean temperatures.
” two unanswered questions”
Pretty easy to find out. The best source of original data is GHCN V4 monthly.
“incredibly accurate”
No, you have gone to the actual computer output. Global average is a calculation, and so the computer will output full floating point. These are not numbers normally presented as data.
But the anomaly average is known to about 1 dp accuracy. The MO and others do it by some variant of calculating grid averages and infilling where necessary using neighborhood information. It works because, unlike raw temperatures, anomalies are homogeneous over these distances. The short-range variation due to topography etc. has been eliminated.
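A minimal sketch of the grid-average step being described: cell anomalies combined with cos(latitude) area weights. The cell values are made up and the infilling step is omitted entirely, so this is only the shape of the calculation, not the Met Office method itself.

    import math

    # (latitude, anomaly) pairs standing in for grid-cell means -- illustrative only
    cells = [(75.0, 1.2), (45.0, 0.8), (5.0, 0.4), (-35.0, 0.6)]

    weights = [math.cos(math.radians(lat)) for lat, _ in cells]
    global_anomaly = sum(w * a for w, (_, a) in zip(weights, cells)) / sum(weights)
    print(round(global_anomaly, 2))   # area-weighted mean of the cell anomalies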
“…numbers not normally presented…” Here’s a gem from the IPCC:
IPCC AR4 Chapter 5 page 387 pdf4
[Executive Summary Opening statement]
The oceans are warming. Over the period 1961 to 2003, global ocean temperature has risen by 0.10°C from the surface to a depth of 700 m.
That extra “0” says it’s not 0.09°C or 0.11°C – the value is claimed to the nearest 0.01°C.
Such precision over 42 years is truly remarkable.
Ocean temperature anomalies are more accurately known, as they are even more homogeneous than air temp.
You’re right about that, but so what? What does that have to do with claiming the temperature of the entire global ocean over 42 years was known to within 1/100th of a degree Celsius?
Yeah, the Climate Alarmists act like the world ocean is just a big bath tub whose temperature is slowly rising continuously, because CO2.
The truth is the oceans are just like the atmosphere: There are warm spots and there are cool spots and these conditions continuously change, going from cool to warm, and from warm to cool.
So claiming a uniform temperature for the world’s oceans is ridiculous and cannot be backed up by measurements.
The Climate Alarmists should explain why there is a “cool blob” in the Atlantic, that obviously is cooler than the surrounding ocean.
There are lots of cool areas in the world’s oceans. It is not a uniform temperature.
They may change more slowly, but the fact that there are almost no sensors covering 90% of the ocean, especially prior to the Argo floats, gives the lie to your beliefs.
And even then their separation means it is not global. 😉
Not from 1960 they are not.
There are no reliable consistent measurements for most of the ocean.
It is all FAKE. !!.
Not unlike the historical pH measurements that were replaced by a computer model.
Really? Have you ever heard of Lorenz, convection, chaos or fluid dynamics?
I suppose you might even be ignorant and gullible enough to think that the deep ocean is heated by sunlight!
It’s not. Try making a comment that is relevant, for a change – and not just a product of your imagination.
Ocean currents can give differences of 3°C or more in different areas over very short periods.
How are ocean temperature anomalies more accurately known when we weren’t even making any ocean temperature measurements?
Nick, show us the uncertainty calculation for a controlled water bath in a lab. Do you think that water bath has a homogeneous temperature throughout?
Except for the abrupt temperature change at the variable depth thermocline, and eddies surrounding the meandering, sinuous currents like the gulf stream, the anomalous, actual heating of shallow coastal waters, and the surface cooling where there is coastal and equatorial up-welling from the abyssal plains, and high evaporation rates in the tropics, and surface cooling where tropical squalls and hurricanes dump huge amounts of water. Other than that, the oceans are quite homogeneous.
Especially since in 1961, there were precious few sensors on the ocean’s surface, and almost none 700 meters down.
GHCN V4 is NOT original data.. Much of it has been “adjusted” by cooling the past, often by a whole degree or more.
The average anomaly IS NOT known to 1 decimal place. That is a mathematical impossibility promulgated by the climate scammers.
You CANNOT remove distance difference by homogenising surface data which is totally corrupted from the start.
GHCN V4 unadjusted is, as it says, raw data.
Wrong. You can go to any GHCN data and see the original vs “adjusted” data.
eg
That’s what the Climate Alarmists did to the global temperature chart, too.
They artificially cooled the past to make the present look like the hottest time in human history.
They knew they were lying when they did this. They did it on purpose for political/personal gain.
Their lies have devastated the world.
Yes, of course. GHCN posts adjusted and unadjusted data. The unadjusted data is unadjusted, and the adjusted is adjusted. They are different, just as your graph shows.
and another one..
Everyone has seen many such “adjustments” of original data..
except, apparently… Nick!!!
This also shows an apparent 50-60 year cycle of heating and cooling.
Start – 1920 warming
1920 – 1980 cooling
1980 – Current warming
IF the cycle repeats, warming until 2040 then a return to cooling.
“There are none so blind as those who will not see.”
And this shows that adding CO2 to air makes thermometers hotter, does it?
Geez, Nick, you really are scrabbling for relevance aren’t you?
Yes, surface temperatures 90 C maximum, -90 C minimum, 70% of the surface covered by water – “calculation” by “climate scientists” definitely needed.
Go on, Nick, convince me you’re not completely delusional. Tell me that you don’t really believe that adding CO2 to air makes thermometers hotter. What’s the matter – cat got your tongue?
Sheesh.
Even worse is claiming H2O feedback which means more heat is hidden from thermometers due to latent heat.
Bullcrap. No wonder you never show resources for your assertions.
Up until about 1980, LIG readings were recorded as °F integers in the U.S. That gives ±0.5°F. I have not done the research, but I suspect that °C field readings were no better than ±0.5 °C.
Both of these result in integer temperatures.
ASOS readings are rounded to the nearest °F before conversion to °C. NOAA also shows a ±1.8 °F uncertainty for ASOS.
NASA also shows CRN temperatures with an uncertainty of ±0.3°C.
Following proper science rules for significant digits does not allow anomalies to have uncertainty that is better than initial measurements.
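A quick sketch of what whole-degree Fahrenheit recording looks like once converted to Celsius; the conversion is standard, the sample readings are arbitrary.

    def f_to_c(f):
        return (f - 32) * 5 / 9

    for f in (60, 61, 62):
        print(f, round(f_to_c(f), 2))      # Celsius values step in increments of 5/9 ~ 0.56

    print(round(0.5 * 5 / 9, 2))           # ~0.28 C: the +/-0.5 F half-interval expressed in C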
FFS you talk some real bollocks. This is what happens in the “real” world of the Met Office
https://tallbloke.wordpress.com/2025/09/29/wye-2nd-addendum-how-round-up-kills-data-accuracy/
Sea level seems to suffer from a similar problem with tide gauge and satellite comparisons. It seems that the climate crusaders favor the satellite data over the tide gauges.
Besides that, there’s this interesting series of titles from R. Steve Nerem of The University of Colorado:
Why has an acceleration of sea level rise not been observed during the altimeter era? 2011
Is the detection of accelerated sea level rise imminent? 2016
Climate-change–driven accelerated sea-level rise detected in the altimeter era 2018
Yes, for climate crusaders the satellite altimetry data has the advantage of extreme malleability.
Note: Steve contributed significantly to that webpage. This is his work, too:
So is this:
If you torture the satellite altimetry data enough it will confess to whatever you want it to say. That’s handy if you want to frighten people. (Best not mention that satellite altimetry can only measure sea-level far from the coast, where it doesn’t matter.)
Thanks for all of that. The LINK you provided shows a graph where the 1992–1998 data was manipulated, resulting in acceleration of sea level rise. I failed to credit Dr. Judith Curry for pointing that out. Or was it Jo Nova? At any rate, it wasn’t me. I had predicted here on WUWT that, because Dr. Nerem hadn’t updated the web page for nearly two years, when he did we should “Look for some big changes to the data.” And there were, but I missed the 1992–1998 manipulation.
Indeed. I find it odd that people trust satellite ‘data’ more than actual tide gauge measurements.
But you state why: you can’t manipulate tide gauges.
I understand the need for global sea levels. However, the actual local level is more accurate but... messy, as it definitely does NOT give you an average. Land may fall, etc.
I think people are obsessed with (global) averages, often with an agenda in mind. And you can hide local anomalies.
And more to the point, altimetry data has an error range that is in CENTIMETERS, as it attempts to “measure” a rate of change measured in MILLIMETERS.
IOW it is a joke.
Kind of like trying to measure the lengths of 2x4s to the inch with a ruler marked only in feet.
“There is even more accurate Met Office data from the past…”
A pity that, when the author was copying the figures from the data files, he missed the columns showing the upper and lower confidence bounds. Nobody is claiming that they know the global temperature to a hundred-millionth of a degree. They explicitly give an uncertainty of around ±0.18°C for 1850.
Why they don’t round the data file to a few decimal places, like most other data sets do, I couldn’t say. But it really makes no difference if all you are doing is downloading and analysing the data, and I would assume anyone reading the values has enough gumption to do their own rounding.
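For anyone downloading the file, a minimal sketch of “doing your own rounding” while keeping the uncertainty columns might look like this (the filename and column names are assumptions based on the HadCRUT5 summary CSV layout and may need adjusting):

# Sketch: read a downloaded HadCRUT5 annual summary file, keep the
# confidence-bound columns, and round for display. The extra digits in the
# file are just unrounded arithmetic, not a claim about real precision.
import pandas as pd

df = pd.read_csv("hadcrut5_summary_global_annual.csv")   # hypothetical local filename
cols = ["Time", "Anomaly (deg C)",
        "Lower confidence limit (2.5%)", "Upper confidence limit (97.5%)"]  # assumed headers
print(df[cols].round(2).tail())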
The “confidence” interval of Met Data is “zero confidence”
There is absolutely ZERO possibility of calculating the “global” temperature of 1850 to any accuracy at all, let alone ±0.18°C. That is totally laughable.
There is no data for MOST of the planet!!
Show us where the oceans were measured accurately, on a daily or even monthly basis, in 1850!
Even the land coverage was basically non-existent over most of the land surface.
Bellend and Stokes aren’t interested in real, measured data. They just know.
This is the way the Met Office treats real world data.
https://tallbloke.wordpress.com/2025/09/29/wye-2nd-addendum-how-round-up-kills-data-accuracy/
The rounding doesn’t matter. The main number itself is based on wrong assumptions and assertions. It is forced.
The problem with carrying more apparent significant figures than are justified is that many people are impressed with ‘authority’ and it makes it too inviting to use such data for propaganda, implying that the data are known to greater precision than can possibly be justified by an objective observer.
How many people, wanting to know the current global temperature, will first look at a CSV file supplied by HadCRUT, ignore the upper and lower bound columns, but assume that a six-digit value means that HadCRUT knows the global anomaly to a millionth of a degree?
If I just wanted to know the HadCRUT anomaly, I’d go to their dashboard and see the monthly anomaly listed as 1.39±0.06°C. More likely I’d just look at the graph.
https://climate.metoffice.cloud/dashboard.html
Here is a revealing reminder of the cavalier attitude to what is laughingly called “data” shown by esteemed past members of the climate-science hierarchy, as exposed in the ‘Climategate’ emails; they were largely responsible for the anomaly charts we are stuck with today. Tom Wigley to Phil Jones, both past directors of the Climatic Research Unit (CRU), University of East Anglia:
‘Here are some speculations on correcting sea temperatures to partly explain the 1940s warming blip.
If you look at the attached plot you will see that the land also shows the 1940s blip (as I’m sure you know). So, if we could reduce the ocean blip by, say, 0.15 degrees Celsius, then this would be significant for the global average—but we’d still have to explain the land blip.
I’ve chosen 0.15 degrees Celsius here deliberately. This still leaves an ocean blip, and I think one needs to have some form of ocean blip to explain the land blip … It would be good to remove at least part of the 1940s blip, but we are still left with “why the blip?”‘ (September 28, 2009: email 1254147614).
“It would be good to remove at least part of the 1940s blip”
The no-good bastards!
The cause of all our climate change problems today: Outright Lies about the temperature record.
These Liars have caused untold damage to the human community.
You see, it was just as warm in the Early Twentieth Century as it is today, according to the written historic records. And this was global as attested to by the numerous written temperature records from around the world.
But if it was just as warm in the recent past, with less CO2 in the air, as it is today, with more CO2 in the air, then the Climate Alarmists have nothing to scare us with over CO2, as CO2 appears to have had little effect on temperatures.
So to change this perception, the Climate Alarmists had to change the temperature profile. They changed it from a benign, cyclical temperature profile, where it warms for a few decades and then cools for a few decades, and repeats this process, with temperatures staying within about a 2.0C range from high temperature to low temperature.
The Temperature Data Mannipulators distorted the temperature profile (see above) to erase the previous warm periods, to make the temperature profile appear to be getting hotter and hotter and hotter, for decade after decade, and today is the hottest time in human history, in order to advance their theory that CO2 is a dangerous gas that will overheat the world.
So, now, just about all of climate science is living in a false reality, believing that the bogus, bastardized Hockey Stick chart is a real depiction of the Earth’s temperature profile.
Such a BIG LIE.
Phil Jones and his cronies should be required to explain themselves for bastardizing the written temperature records. The last thing Phil Jones said about it was he would not share his data because if he did someone might find something wrong with it. This is not science.
At present, surface temperature extremes vary between about 90 C and -90 C.
The “mean” temperature is a “climate science” fiction, rather like claiming “climate” is responsible for weather events.
Which just goes to show that “climate scientists” are even more ignorant and gullible than the average dimwit. They should know better.
Even ignorant and gullible enough to believe that adding CO2 to air makes thermometers hotter! What real scientist would believe that?
More correctly: At present, surface temperatures vary between about… Because it is ALWAYS the hottest time of day somewhere, and the coldest somewhere else!
Eight-digit “accuracy” is simply a product of the division involved in averaging. For example, average these seven readings:
0.10 + 0.20 + 0.15 + 0.25 + 0.20 + 0.15 + 0.10 = 1.15
1.15 / 7 = 0.16428571428…
It also shows that either the scientists in charge or the programmers (if not both) don’t know what they are doing.
Did you know that if you divide a whole number (one that isn’t a multiple of 97) by 97, you get 96 recurring digits!
Now that is accuracy 😉
….. but if you divide a whole number by 100, you only get 2 decimal places at best. 🙁
Math is a truly wonderful thing
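If anyone doubts the 97 claim, the length of the repeating block is just the smallest power of ten that leaves remainder 1 when divided by 97, which a few lines can confirm (a sketch; it only applies to divisors with no factor of 2 or 5, which is why 100 gives a terminating decimal instead):

# Period of the repeating decimal of 1/d, valid only for d coprime to 10.
def decimal_period(d):
    k, r = 1, 10 % d
    while r != 1:
        r = (r * 10) % d
        k += 1
    return k

print(decimal_period(7))    # 6  (1/7 = 0.142857...)
print(decimal_period(97))   # 96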
Your illustration shows why high-caliber scientists will truncate their reported calculations to the same precision as the input data. Thus, what should be reported is 0.16 +/- the in-quadrature uncertainty inherited from the input data.
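As a worked version of that rule, using the seven example readings above and assuming (my assumption) a per-reading uncertainty of ±0.05, i.e. half their 0.1 resolution:

# Mean of the seven example readings, uncertainty propagated in quadrature,
# result reported at the precision of the inputs.
import math

readings = [0.10, 0.20, 0.15, 0.25, 0.20, 0.15, 0.10]
u_single = 0.05                                   # assumed per-reading uncertainty
mean = sum(readings) / len(readings)              # 0.16428571...
u_mean = u_single / math.sqrt(len(readings))      # in-quadrature propagation for a mean
print(f"{mean:.2f} +/- {u_mean:.2f}")             # 0.16 +/- 0.02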
So True
Might I add a question?
If 1950 was the year that CO2 emissions began to be significant (their year not mine), why would they choose 1850 to 1900 as the reference period?
The answer is obvious to anyone who thinks about it for a moment. I’m calling it out because so many people haven’t.
“If 1950 was the year that CO2 emissions began to be significant”
It wasn’t. In 1950 CO2 was at about 310 ppm; over 10% higher than 1850.
And thank goodness for that.
310ppm is barely above subsistence level..
Plant life needs FAR MORE atmospheric CO2 than that..
And as you are well aware, CO2 has zero measured effect on the climate.
The slight rise, combined with the slight rise in atmospheric temperature from the coldest period in 10,000 years, due to solar input….
… has allowed the plants and the planet to prosper.
“The slight rise, combined with the slight rise in atmospheric temperature from the coldest period in 10,000 years, due to solar input….”
I thought it was El Nino what done it? Make up your mind……
Almost. From ice cores, the CO2 level in 1850 was about 285.2 ppmv, and in 1950 it was about 311.3 ppmv (source). That’s a 26.1 ppmv (9.2%) increase.
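For what it is worth, the percentage quoted there is just simple arithmetic on the two ice-core figures:

# Reproducing the rise quoted above from the two figures given.
ppm_1850, ppm_1950 = 285.2, 311.3
rise = ppm_1950 - ppm_1850
print(f"{rise:.1f} ppmv ({100 * rise / ppm_1850:.1f}%)")   # 26.1 ppmv (9.2%)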
You cannot derive CO2 levels from ice cores.
Correction: you CAN but it is always an average within a time frame. That timeframe is usually rather wide. More to the point: NOBODY can derive CO2 levels for 1850 from ice cores. Or 1900 for that matter.
It is not scientific.
From 280 (background) to 310 in 1950 is a difference of 30 ppm spread out over 100 years. That’s 0.3000000 (8 digits there since you’re OK with that) ppm/year. It is all over various reports that acceleration of emissions became significant starting in 1950.
Nice try, but 10% over a century doesn’t cut it. If emissions that low were in fact significant, then may I point out that at 3.5 ppm/year we’re now at a rate 11.666666 TIMES as much per year. Yet, per the IPCC, there is no discernible global trend in hurricanes, droughts, floods or wildfires, which means 3.5 ppm/year isn’t very significant at all and 0.3 ppm/year is therefore minuscule.
1950 as the year acceleration of emissions became significant has been cited over and over and over, and any attempt to make 0.3 ppm/year “significant” is just typical alarmist bait and switch.
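The rates behind those numbers are easy to reproduce (a sketch; the 3.5 ppm/year current rate is the figure quoted in the comment, not something checked against a dataset here):

# Early rate vs. quoted current rate.
early_rate = (310 - 280) / 100        # 0.3 ppm/year over 1850-1950
current_rate = 3.5                    # ppm/year, as quoted above
print(early_rate, round(current_rate / early_rate, 6))   # 0.3  11.666667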
Here is a graph of cumulative emissions and mass of C in the atmosphere; mass C relates to ppm by the proportionality shown. Both excess ppm and cumulative emissions are pretty much exponential. There is nothing special about 1950.
The GHE responds to the actual ppm, not the rate of change.
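For readers who want the proportionality in numbers: roughly 2.13 GtC of carbon in the atmosphere corresponds to 1 ppm of CO2, a standard conversion factor (the sample figure below is illustrative, not read off the linked graph):

# Converting a mass of atmospheric carbon to a CO2 mixing ratio.
GTC_PER_PPM = 2.13                  # ~2.13 GtC of atmospheric carbon per 1 ppm CO2
extra_carbon_gtc = 280.0            # hypothetical increase in atmospheric carbon, GtC
print(round(extra_carbon_gtc / GTC_PER_PPM, 1))   # ~131.5 ppm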
“The GHE responds to the actual ppm, not the rate of change.”
Explain previous warm periods (MWP, Holocene etc) then, when CO2 was at 300ppm or below.
Another bait and switch.
If the GHE in fact responds to the actual ppm, then no change in ppm means no change in GHE, and a small change in ppm means a small change in GHE. Do you wish to argue that, had CO2 increases remained at 0.30 ppm/year, we’d even be having this discussion?
Your own graph shows the significant acceleration of emissions around 1950 +/- 3.6279856 years. No acceleration, no discussion to be had.
Cumulative emissions leave out the effect of sinks. Cumulative emissions are going to be much larger than the net increase in concentration.
And? You are one of those deluded people who believe that adding CO2 to air makes thermometers hotter, aren’t you?
Keep babbling irrelevancies.
Good questions from this article:
“Where can we consult the actual original global data?”
“How were those incredibly accurate anomaly figures calculated?”
At this url: https://berkeleyearth.org/press-release-2024-sets-another-all-time-temperature-record/ it states:
“Berkeley Earth’s proprietary data processing approach allows for the incorporation of more temperature observations for more locations, resulting in a temperature data set with a higher spatial resolution than others of its kind.”
I am skeptical how proprietary data processing at Berkeley Earth can “co-exist” with open and transparent data that Berkeley Earth seems to “champion”.
Any comments are welcome. Thanks.
Fudging, faking, and fiddling – but of course, Berkeley Earth (hoping you will think they are associated with UC Berkeley – they aren’t) claim they are “scientific”.
They believe that adding CO2 to air makes thermometers hotter! Fools? Frauds?
I really don’t know.
One of the two, I would say.
Berkeley Earth distorts the temperature profile of the Earth just the way Phil Jones distorted the temperature profile.
There is no “hotter and hotter and hotter” Hockey Stick global temperature profile in the written, historic temperature records, and there is no record at all of sea surface temperatures, so how did Berkeley Earth and Phil Jones manage to get a hotter and hotter and hotter temperature profile out of data that has no such temperature profile?
In Phil Jones’s case, he just made up sea surface temperatures, and either Berkeley Earth did the same, or they are using Phil Jones’s data.
There is no Hockey Stick chart “hotter and hotter and hotter” temperature profile in the written, historic records. Their temperature profiles look nothing like a Hockey Stick. There is only one logical conclusion to reach: These Climate Alarmists made all this stuff up out of whole cloth, because there isn’t any data to support it.
Correction –
Phil Jones et al didn’t have any “data” in the proper sense.
Numerical constructs aren’t DATA.
“Proprietary” sounds a lot like the excuse given by Phil Jones: anyone who knows how we did it is liable to find a problem.
The answer is wrong.
Global-scale observations from the instrumental era were patchy at best, with vast areas of the southern hemisphere having barely any temperature records until the 1920s or later.
That’s the real reason they chose the 1850-1900 global average.
Indeed. And it opens the door to… ‘calibration’ in extremis.
I was intending to send those unanswered questions to the BBC Verify department which, as you all know, is the unchallenged authority on all things political and scientific. Unfortunately they are having a few internal issues at the moment. I then considered asking the Met Office which, as you all know, has the longest uninterrupted record of temperature data for a large area of the world called England. They were unable to help because the computer simulations they now rely on don’t go back into the 19th century for some reason. Not enough electrons, apparently?
I am thinking of trying Australia for the answer. They have something called dream time and even deep dream time to call upon for answers. I will keep you posted.
“I am thinking of trying Australia for the answer.”
No use. The BOM declared all official temperatures prior to 1910 “unreliable”.
A cynic might say this was done to make the 1896-97 heatwave (which officially killed at least 436 people in the Eastern states) vanish from the official record.
Replace “unreliable” by “inconvenient” and you’ll be much closer to the truth.
Good point.
Those previous warm periods are very problematic for Climate Alarmists. They want to pretend they didn’t exist. It messes up their Climate Scare narrative.
Measuring the temperature of the Earth is a ridiculous concept, especially before the arrival of satellites.
And those satellites work by proxies and calibrations, as below 5 km the signal gets... well, messy. Estimations cannot beat observations. It is good to have satellites, but you cannot base your science on them.
The issue remains the actual (near) surface temperature measurements and the anomalies throughout history.
Measuring the temperature of the Earth is a ridiculous concept. There is no temperature of the whole. There surely are different climates. The average is meaningless. There just ain’t no such thing in the first place.
Accurate temperature trends for a given location are a different story altogether. We expect trends to follow patterns. Nature often operates in cycles, 11 years for sunspots. Local climates, too. The growing season back home is longer than when I was a kid. Climates cycle.
The fact that I can’t find *any* report of significant changes in hardiness zones belies the claim of catastrophic climate change. You would think that CAGW would be affecting hardiness zones *somewhere* in North America, but I’ll be darned if I know where.
If someone asks what the average global temperature and range is, do you think just saying “We don’t know” will suffice? Sometimes even just having an order-of-magnitude estimate can be useful.
The only way you can establish a difference is by using the scientific method whereby both the conditions and the ways of measuring the parameters are the same.
So what we can establish is: that is not the case when looking at (near) surface temperature.
End of experiment…or,
Second best: come as close as you can to parity, using as many similarities as possible, such as more or less the same temperature stations under the same conditions, say rural ones untainted by urbanisation. Then consider the number of temperature measuring points, and the difference between, say, hourly thermometer readings and 5-minute electronic Stevenson-screen readings, filtering out the lows and especially the highs.
I know, it’s messy. But keep it local as you cannot establish global means.
Then you can look at 1979 as a starting point, with satellites and all the other measuring stations doing the same thing.
But, be sure to remember that each satellite that is sent up has its instruments “calibrated” to read, as closely as possible, the same “temperature” that the thermometers on the ground read.
In my experience people who produce numbers with improbable accuracy are either dilettantes who have no idea about the actual accuracy of the measurements or complete crackpots.
Here’s the closest I’ve been able to find to original data: https://archive.org/details/worldweatherreco0001clay
I would like to give two examples of the crazy botching of data by the UK Met Office; there are lots more, but these two will do for now.
https://tallbloke.wordpress.com/2025/09/29/wye-2nd-addendum-how-round-up-kills-data-accuracy/
Then consider this, which will be followed up soon with 4-year comparisons showing massive error bars (double-digit Celsius, yes really):
https://tallbloke.wordpress.com/2025/09/26/rothamsted-addendum-a-study-into-rising-mean-minimum-temperatures-using-comparative-data/
From the U.S. ASOS manual.
Sound familiar?
Thanks for that, Jim. I shall refer to it in future posts over on the Talkshop. And yes, very familiar!