Earlier this month, the Met Office claimed that climate change was causing a “dramatic increase in the frequency of temperature extremes and number of temperature records in the U.K.”. Given what we now know from recent freedom of information (FOI) revelations about the state of its ‘junk’ nationwide temperature measuring network, it is difficult to see how the Met Office can publish such a statement and keep a straight face.
The claims were the headline findings in the operation’s latest state of the U.K. climate report and are said to be based on “observations from the U.K.’s network of weather stations, using data extending back to the 19th Century to provide long term context”. That would be the network where nearly eight out of 10 stations are deemed by the World Meteorological Organisation (WMO) to have ‘uncertainties’ – i.e., potential errors – of between 2°C and 5°C. The same junk stations that provide ‘record’ daily temperatures, often in the same places, such as the urban heat furnace that is Heathrow airport. The same junk measurements that the Met Office collates and quotes down to one hundredth of a degree centigrade.
The WMO rates weather stations by the degree of nearby unnatural or natural temperature corruption. Classes 4 and 5 have possible corruptions of 2°C and 5°C respectively and these account for the vast majority of the Met Office sites. The WMO suggests that Class 5 should not be used to provide an accurate measurement of nearby air temperatures, yet nearly a third of the Met Office sites are classified in this super-junk category. Only classes 1 and 2 have no uncertainties attached and only these should be used for serious scientific observational work. But, inexplicably, the Met Office has very few such uncorrupted sites. Even more worryingly, it seems to show no sign of significantly increasing the paltry number of pristine sites.
Human-caused and urban heat encroachment are the problems, with extreme cases found at airports, which can add many degrees of warming to the overall record. But this has been known for some time, and it is a mystery why the Met Office has not done anything about it. Recent FOI disclosures reveal that over eight in 10 of the 113 stations opened in the last 30 years are in junk classes 4 and 5. Worse, 81% of stations started in the last 10 years are junk, as are eight of the 13 new sites in the last five years.
It’s almost as if the Met Office is actively seeking higher readings to feed into its constant catastrophisation of weather in the interests of Net Zero promotion. Whatever the reason – incompetence or political messaging – serious science would appear to be the loser. As currently set up, the Met Office network is incapable of providing a realistic guide to natural air temperatures across the U.K. Using the data to help calculate global temperatures is equally problematic.
Of course, the Met Office can rely on its helpful messengers in the mainstream media not to breathe a word about this growing scientific scandal. The central plank of Net Zero fear-mongering is rising temperatures and claims that ‘extreme’ weather is increasing as a result. Temperatures have risen a bit over the last 200 years since the lifting of the mini ice age, the clue to the pleasant bounce being obvious to all. But this is not enough to force the insanity of Net Zero on humanity, so fanciful climate models and bloated temperature databases are also required. The compliant media are uninterested, but the cynicism and outright derision over the Met Office’s temperature antics are growing. The Met Office regularly posts on X and it cannot be unaware that a growing number of replies are less than complimentary. Last week, it announced the “warmest day of the year” based on measurements taken at Heathrow. The following are a few of the more polite comments it received:
What is it about LHR that could make it hotter than surrounding areas? I will give you a clue – concrete and hot jet exhausts maybe?
Real temperatures should be taken out in the open away from London.
…manipulating temperatures to fit the climate agenda.
Might as well measure inside an oven.
It’s all made up to fit your agenda.
I have a brighter red highlight in my fonts that I can lend you if you think the one you chose does not push the propaganda enough!
Remind us where you were taking temperature recordings in the last century, because it wasn’t on the roasting tarmac of airports.
Urban heat islands should not count and you know it but the grift continues.
In its recent annual report, the Met Office claimed that “our new analysis of these observations really shines a light on the fastest changing aspects of our weather as a consequence of climate change”. It is not just temperature data that is brought to the Net Zero table, but rainfall as well. The indefatigable investigative journalist Paul Homewood took a look at how the Met Office spun precipitation in a recent article in the Daily Sceptic. He agreed with the Met Office’s claim that rainfall has risen since 1961, but asked why that year was chosen to start the timeline. The graph below shows why.
England and Wales are rainy countries, but their island position in the North Atlantic leads to regular seasonal, yearly and longer-term decadal variations. The year 1961 fell within a drier interlude, and current totals are similar to those around the 1930s, 1880s and 1780s.
Helped by the widespread availability of satellite images and measurements, the Met Office does an excellent job in forecasting short-term weather and is of great benefit to shipping, the military, agriculture and the general population. But the state body, funded by over £100 million a year, is clearly riddled with green activists who, on the evidence that a number of sceptical journalists have presented, are using unreliable figures, carefully curated statistics and inaccurate measurements to promote their own attachment to the insanity of hydrocarbon elimination.
Chris Morrison is the Daily Sceptic’s Environment Editor
It only needs a fence in the bushland to create a heat island. Just anything that will block the natural flow of air across the surface.
Official weather recording stations should be free of all structures. Not to mention heat-generating capacity.
–Breaking News—
A recent study by a group of renowned climate scientists suggests — Demolishing all large cities and urban areas with populations >5million people will help stop runaway global temperature increases.
The study found that………..
Which landfill would be used?
Would all the deconstruction equipment, explosives, etc., be EV and/or carbon free?
Questions. No valid answers.
Professions that care about their integrity do not have agendas of any kind. The Met Office has a desire to promote the dangers of “global heating” in spite of the widely available evidence of the Holocene, when the populations of the British Isles lived and thrived in plenty with temperatures well above those we have now.
The lack of care the Met Office takes with its measuring equipment, especially where ‘records’ are concerned, tells us that the Met Office does not want to be seen as doing anything other than following the political scam (CAGW) without any scientific meteorological justification at all. Indeed the whole CAGW scam was triggered by a ‘trick’ at UEA, which just happens to be located in one of the warmest areas in the British Isles, if not the warmest. But the Met Office likes the easy money that comes when civil servants follow political prognostications (aka agendas) and doesn’t want to be cancelled. What an upside down place the UK has allowed itself to become, and it will only get worse the weaker people are.
And where, oh where, is the proof carbon dioxide is causing warming? Nowhere is where it is since it doesn’t exist and people need to start asking questions as to why they are being constantly lied to.
With idiots like Miliband in charge of Labour’s Net Zero what could possibly go wrong?
This has been covered by Homewood with several posts over the last 2 weeks – but hey ho.
“The same junk stations that provide ‘record’ daily temperatures often in the same places, such as the urban heat furnace that is Heathrow airport.”
And
“Last week, it announced the “warmest day of the year” based on measurements taken at Heathrow.”
No it was Cambridge.
Heathrow was 2C cooler
“Monday has been declared the hottest day of the year in the UK so far, reaching 34.8C (95F) in Cambridge, according to the Met Office.”
https://wow.metoffice.gov.uk/observations/details/?site_id=27484233
Heathrow Highest 12 August, 2024
32.8 °C
“But, inexplicably, the Met Office has very few such uncorrupted sites. Even more worryingly, it seems to show no sign of significantly increasing the paltry number of pristine sites.”
No, not “inexplicably” but necessarily, as is explained by the UKMO:
https://www.metoffice.gov.uk/weather/learn-about/how-forecasts-are-made/observations/observation-site-classification
“WMO Siting Classifications were designed with reference to a wide range of global environments and the higher classes can be difficult to achieve in the more-densely populated and higher latitude UK. For example, the criteria for a Class 1 rating for temperature suits wide open flat areas with little or no human influenced land use and high amounts of continuous sunshine reaching the screen all year around, however, these conditions are relatively rare in the UK. Mid and higher latitude sites will, additionally, receive more shading from low sun angles than some other stations globally, so shading will most commonly result in a higher CIMO classification – most Stevenson Screens in the UK are class 3 or 4 for temperature as a result but continue to produce valid high-quality data. WMO guidance does, in fact, not preclude use of Class 5 temperature sites – the WMO classification simply informs the data user of the geographical scale of a site’s representativity of the surrounding environment – the smaller the siting class, the higher the representativeness of the measurement for a wide area……”
” He agreed with the Met Office’s claim that rainfall has risen since 1961, but asked why that year was chosen to start the timeline. The graph below shows why.”
And he was given an answer on here:
“Why is the period 1961 to 1990 used for the long-term averages? The 30-year period 1961 to 1990 has been designated as the international standard reference period for climate averages by the World Meteorological Organization.”
The UK MetO is responsible for the, well, UK.
That includes Scotland
“England and Wales are rainy countries, but their island position in the North Atlantic leads to regular seasonal, yearly and longer-term decadal variations. The year 1961 fell within a drier interlude, and current totals are similar to those around the 1930s, 1880s and 1780s”
The UKMO is saying that there is a greater frequency of more extreme rainfall events.
NOT that there is more rainfall overall (tho there is).
From MetO climate report:
“The most recent decade has had around 20% more days of exceptional rainfall compared to the 1961-1990 averaging period. While there is no significant signal for this change being more pronounced in a specific area of the UK, overall, this analysis clearly shows an increase in the number of very wet days in the UK’s climate in recent years compared to what was observed just a few decades ago.”
Western/northern higher ground is where orographic influences are at play – that is where most extreme events come from frontal rainfall (not convective events). Hence missing off Scotland removes the part of the UK that catches the most extreme prolonged rainfall from the Atlantic, enhanced by the Scottish mountains.
And at the 10mm rainfall event mark (which is a much more realistic amount to consider as it lies in the upper end of events but not the extreme).
Everything you just typed is from surface sites that are basically UNFIT-FOR-PURPOSE.
“most Stevenson Screens in the UK are class 3 or 4 for temperature as a result but continue to produce valid high-quality data”
Utter BS !!
By their very definition, they produce data which could be as much as 4 to 5C higher than they would have produced in the past…
” He agreed with the Met Office’s claim that rainfall has risen since 1961, but asked why that year was chosen to start the timeline. The graph below shows why.”
And he was given an answer on here:
“Why is the period 1961 to 1990 used for the long-term averages? The 30-year period 1961 to 1990 has been designated as the international standard reference period for climate averages by the World Meteorological Organization.”
That’s the answer to a different question.
You can say that again.
That’s the answer to a different question.
(I did it for you).
“No it was Cambridge.”….
Here is Cambridge weather station.. how much higher than Class 5 do they go ?????
They need to go to 11
Cambridge Weather Station
Oh, and from what you have posted, Met Orifice is admitting that they are measuring mostly URBAN temperatures.
Great to have you confirming that they know they are using almost universally JUNK data…
… and that they are well aware of that fact.
The temperature sensor at this site is under the metal walkway, close to what appears to be an asphalt roof, between two low walls – no wonder it was the hotspot; it couldn’t fail to be.
“WMO Siting Classifications were designed with reference to a wide range of global environments and the higher classes can be difficult to achieve in the more-densely populated and higher latitude UK”
Difficult is not impossible.
If the network is not fit for purpose then that should be recognized by the Met and their results should only show precision compatible with the measurement uncertainty associated with the network as it stands.
With the majority of the network having a base measurement uncertainty of +/-2C to +/-2.5C, temperatures shouldn’t actually be quoted beyond the units digit, and even then should probably be rounded to the nearest 5C (i.e. shown as either 0 or 5 in the units digit).
Averages of these temperatures across the network shouldn’t go past the tens digit. You can’t increase resolution or accuracy through averaging; both just get worse as you add data, because measurement uncertainty is additive.
The meme that measurement uncertainty is random, Gaussian, and cancels is so pervasive in climate science that it is unbelievable and ridiculous.
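For what it’s worth, here is a minimal sketch (Python, made-up numbers, not Met Office data) of the distinction being argued over in this sub-thread: independent random error does tend to shrink when many readings are averaged, but a one-sided siting bias shared across readings does not, no matter how many stations are added.

```python
import numpy as np

rng = np.random.default_rng(42)
n_stations = 1000
true_temp = 15.0                                   # hypothetical "true" value, degC

# Case 1: purely random, independent error of about +/-0.5 degC per reading
random_err = rng.normal(0.0, 0.5, n_stations)
mean_random_only = np.mean(true_temp + random_err)

# Case 2: add a one-sided siting bias (0 to 2 degC warm), fixed per station;
# averaging more stations does nothing to remove it
siting_bias = rng.uniform(0.0, 2.0, n_stations)
mean_with_bias = np.mean(true_temp + random_err + siting_bias)

print(f"true value:           {true_temp:.2f}")
print(f"mean, random error:   {mean_random_only:.2f}")   # close to 15.00
print(f"mean, plus warm bias: {mean_with_bias:.2f}")      # roughly 16, not 15
```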
Here is the WMO’s description of a Class 5 site
“a site where nearby obstacles create an inappropriate environment for a meteorological measurement that is intended to be representative of a wide area”
Almost 80% of the Met Office sites are in class 4 and 5.
Precisely.
The question for Anthony is whether, given this, he thinks that the Met network is nevertheless able to supply “meteorological measurement representative of a wide area”.
And if he does think this, why?
LOL!
The “reference period” is for use in determining anomalies. It is NOT the point where trends should begin! Make another guess.
What a howler! And Banton pompously claims to be a Meteorologist!
So are Class 2 -5 stations there just for participation certificates?
Anthony,
Are you claiming that the Met Office network is fit to determine whether or not there has been a “dramatic increase in the frequency of temperature extremes and number of temperature records in the U.K.” in some specific period?
I am not sure from this quote exactly what the claim is, other than that we should be alarmed. But take some reasonable quantified version of it, for instance that there have been twice as many temperature extremes and records in the last ten years than in any previous ten year period. Whatever an ‘extreme’ is. Or whatever claim they want to make.
Are you saying the Met Office network is of adequate quality that we can confidently use it to verify such a claim?
This continues the usual theme of pretending there is no warming anywhere because surface stations can’t be trusted and it’s all down to UHI.
Ignoring, as usual, that the UAH lower troposphere data set, which is prominently featured on this site, has shown a faster warming trend than the Met Office HadCET, NOAA and NASA/GISS surface data sets over the past 20 years.
If UAH is the ‘gold standard’ then this would suggest that the adjustments made to the surface data have had a cooling, not a warming, effect over the past 20 years.
Again with your ignorant BS.
The only reason the UAH data in the last 20 years shows warming is because of the impact of 2 MAJOR El Nino events.
El Ninos have a greater effect on the atmosphere, because they are a release of energy direct to the atmosphere.
There has been no actual HUMAN CAUSED WARMING in the last 20 years of UAH data.
(or the whole UAH record , for that matter.)
If you believe there is.. then show it to us. !
To be fair, TheFinalNail did not use the word “human” in his post.
Your post continues the usual theme of contesting a statement that was not made, thus :
“pretending there is no warming anywhere because surface stations can’t be trusted and it’s all down to UHI”
Which is a powerful, even overwhelming, response if that is what was said. But:
“Temperatures have risen a bit over the last 200 years since the lifting of the mini ice age, the clue to the pleasant bounce being obvious to all”
You are a really really good brainbox.
Yes, exactly.
“This continues the usual theme of pretending there is no warming anywhere because surface stations can’t be trusted and it’s all down to UHI.”
This is *NOT* the claim by most skeptics today. It’s a strawman argument you like to argue. The main claim today is that the temperature record is not fit for purpose and, therefore, what is actually happening is UNKNOWN!
It is the global warming crowd that is claiming they can know the unknowable. Just like the carnival fortune teller with their cloudy crystal ball – which knows all and shows all.
Yes even the IPCC wrote that it is not possible to predict future climate states because climates are coupled, non-linear, chaotic systems.
Why don’t climate “scientists” accept what their IPCC states as fact?
(hint, they won’t have anything to publish, which I’m told is not good for a researcher’s career prospects or pay)
UAH is a proxy that does not measure surface air temperatures, plus it is not “global”.
How does UAH eliminate the UHI effects in the lower troposphere?
One should notice that your references do nothing to address what the effects of UHI actually are and how they should be adjusted. You appear to not have a clue of what you are talking about.
References:
https://www.scienceunderattack.com/blog/2024/3/18/exactly-how-large-is-the-urban-heat-island-effect-in-global-warming-151
https://x.com/orwell2022/status/1808063520069575084?t=R0zTfxhtekLdwkKwuA0KfA&s=19
This fellow is using Brightness Index (BI) and photos from satellites to correlate UHI with temperature growth.
Do you have any research that shows city growth has not had any UHI effects?
Show us the unadjusted data from the class 1 sites. Same with class 2, 3, 4, 5 so we can see the differences. Short of actual data, you’re simply defending your faith in the MET and taking their word that they do unbiased data analysis. Show us the (unadjusted) data.
This continues the usual theme of pretending there is no warming anywhere because surface stations can’t be trusted and it’s all down to UHI.
No, this is not the argument. The argument is a very specific one: it’s that, based on the quality of the Met station network, it is impossible to say whether there has been any dramatic increase in temperature extremes.
Do you think it is up to the job, and if so why, given what we know about the rating of the stations? Given also what we know about changes to the stations over the period for which the claim is being made.
The argument about warming is a quite different one: it’s that there has been some global warming over the last 50 or so years. Probably some of it is due to CO2 rises, though much, maybe most, is due to natural cycles. But there is nothing alarming about it, and the installation of large quantities of wind turbines and solar panels will (a) not work and (b) have no effect on the warming.
Don’t forget the simple, inescapable fact that is never mentioned: super-fast-responding digital temperature measurement will pick up variation faster than old mercury thermometers, inherently leaning towards more extreme highs and lows. Why has a study not been done on short-term extreme temperature measurement with digital vs mercury thermometers? It seems fairly simple and critical to calibrating a change in measurement techniques.
It would be very simple and inexpensive to run both systems in parallel for a substantial comparison period every time there’s a system upgrade. AFAIK they never do it.
Because electronic devices are more accurate, as they can record the temperature to hundredths of a degree Celsius; after all, the accuracy of a mercury thermometer is only 0.5 of the smallest division. /sarc
I get your humor.
Electronic thermal sensors have specified full range accuracies and established drift errors. None are identical.
Now, should someone care to go to the expense of characterizing each sensor and using that to adjust the individual sensor reading, a more accurate data set could be created. Lacking that, which goes beyond calibration, one must use the full error tolerances of the device. Then the data set includes the measurement error data.
The issue isn’t just the sensor, it is also the enclosure the sensor is part of plus the microclimate of the surroundings of the enclosure. For instance, as the paint on the enclosure ages it will change reflectivity/absorptivity. That will affect the ambient temperature inside the enclosure and therefore the measurement uncertainty of the temperature readings from the station. Climate science just ignores all of these contributions to uncertainty. They wouldn’t know an uncertainty budget if it bit them on the butt.
Actually the term accuracy is not correct. The term you are looking for is resolution. Most Fahrenheit LIGs are rounded to the nearest integer, which results in an uncertainty of ±0.5°F.
ASOS stations have a resolution of 0.1°F with an uncertainty of ±1.8°F.
My wife tried to convince me that our digital (spring based) bathroom scale was accurate to 0.1 lb. Her concept was based on display resolution. I looked it up in the manual and informed her the scale was accurate to +/- 2.0 lb. full scale and that display resolution did not amount to any kind of accuracy for absolute measurement. But it could be used to evaluate delta measurements to 0.1 +/- 0.05 lb. for her weight from day to day.
Scientific notation is no longer applied. The rule is, the result of a calculation can have no greater precision than the least precise datum in the calculation.
1/2 = 0 +/- 0.5
1.0/2.0 = 0.5 +/- 0.05
1.00/2.00 = 0.50 +/- 0.005
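If it helps, here is a tiny Python sketch of the rule as stated above, i.e. the quotient keeps no more decimal places than the less precise operand (the helper functions are hypothetical, not any standard library routine):

```python
def decimal_places(s: str) -> int:
    """Count the decimal places implied by how the number was written."""
    return len(s.split(".")[1]) if "." in s else 0

def precision_limited_div(a: str, b: str) -> str:
    """Divide, keeping no more decimal places than the less precise operand."""
    places = min(decimal_places(a), decimal_places(b))
    return f"{float(a) / float(b):.{places}f}"

print(precision_limited_div("1", "2"))        # -> 0
print(precision_limited_div("1.0", "2.0"))    # -> 0.5
print(precision_limited_div("1.00", "2.00"))  # -> 0.50
```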
A lot of bathroom-type scales also have problems with repeatability; step on and off several times and the numbers will likely vary ±2 lbs.
and I bet you got “the look” after delivering that bit of essential info 🙂
Did you also tell her that her bum looked big in that dress?
Where I worked, I was involved with making displays that gave useful data to the operators that made decisions based on the information and/or was used in reporting (along with our SCADA system sending out accurate info to other devices). Some of our “raw” data coming in was corrupted by the connection between the device and what it was reporting. Sometimes the sensor itself was too sensitive.
For such we applied “smoothing” and “dead bands” to the programs.
But we were out to be accurate and as “honest” as we could. There could be real consequences from the public or the EPA if we really screwed up.
A related thing.
Our lab had to establish a “Method Detection Limit” for a test we reported to the EPA.
It involved running the same test on the same sample (I think it was) 50 times. (There were lots of steps involved.)
The last step was to weigh the sample on a digital scale that could give a reading out to something like 7 or more decimal places of a gram. I think 95% of the decimal places had to agree to determine our lab’s “Method Detection Limit” for that particular test and the way we ran it.
It turned out that 0.5 mg/L was our lab’s reportable limit.
There was a report discussing how that was done in Australia with the concern that the replacement of old thermometers with new electronic sensors dropped the 2 year cross comparison testing and the historical record was then adjusted to align with the new sensors.
The only serious question is, how do you monitor the thermometer remotely with the ability to synchronize the data collected from the 2 sensors?
With new automated stations, an attempt to match the hysteresis of LIGs is made by averaging over 5 minutes. As for how well it works, I’ve not really seen a comprehensive study.
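Out of curiosity, here is a toy model of that comparison (Python, all numbers invented, and the LIG time constant of roughly a minute is purely an assumption): a fast electronic probe sampled every second, the same air seen through a first-order lag standing in for the LIG, and the probe readings block-averaged over 5 minutes as described.

```python
import numpy as np

rng = np.random.default_rng(3)
dt = 1.0                                    # seconds between probe samples
t = np.arange(0, 3600, dt)

# synthetic air temperature: slow warming, small turbulence, rare 2 degC gusts
air = (20.0 + 0.001 * t
       + rng.normal(0.0, 0.3, t.size)
       + 2.0 * (rng.random(t.size) < 0.001))

# liquid-in-glass thermometer modelled as a first-order lag, tau ~ 60 s (assumed)
tau = 60.0
lig = np.empty_like(air)
lig[0] = air[0]
for i in range(1, air.size):
    lig[i] = lig[i - 1] + (dt / tau) * (air[i] - lig[i - 1])

# fast probe reported as 5-minute block averages (300 one-second samples)
probe_5min = air.reshape(-1, 300).mean(axis=1)

# compare the maximum each version of the record would report
print(air.max(), lig.max(), probe_5min.max())
```

In runs like this the raw one-second maximum sits above anything the lagged instrument reports, which is the concern raised above; how closely 5-minute averaging reproduces the LIG response depends entirely on the assumed time constant.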
It’s something that has intrigued me since electronic sensors started being introduced. Any person who works with data has to be interested in how the new equipment compares with the old.
Or am I unusual in wanting to know how the two systems compare?
I would say: no. And I would also like to know if they had been tested in the same location to spot the exact temperature difference. I mean, side by side. Then they would have a legitimate reason to adjust one or the other.
I assume they haven’t done that, but that might be wrong…
Since you don’t know what calibration drift and microclimate changes were in the past, adjusting those temps based on a current comparison of new and old could very well make past temperatures *more* inaccurate. The only legitimate adjustment would be to adjust temperatures from the new station to match the old station. Like everything else screwed up in climate science, they do the exact opposite: they adjust all the past temps to match the new station – just assuming, with no actual physical evidence, that the old station had to be wrong over its entire life.
Nuff said…
https://www.bbc.co.uk/news/av/world-24713504
I worked in the mining industry in the 80s and 90s. Part of the licence procedure imposed by the NCB required each mining site to install and maintain a Stevenson Screen. The collected data was forwarded to the NCB, who told us it was forwarded to the Met Office.
For the mining contractor this was a “nothing job” having little to do with the requirements of the day. It was usually given to the chain boy, who worked for the surveyors carrying and cleaning their kit. The chain boy religiously collected the data each day and it was duly forwarded to the NCB.
How anyone could reasonably expect someone employed as a chain boy to accurately record the data is beyond me. At any given moment during that time there would be up to 30 or so sites providing data of doubtful accuracy for years.
Even if the data was collected by someone with a PhD in any science, I still wouldn’t trust it.
I would trust it, but only to +/- 5F.
They probably round-filed it on the receiving end.
Story tip
https://www.theguardian.com/environment/article/2024/aug/14/how-does-today-extreme-heat-compare-with-earth-past-climate
The station records have adjustments applied to remove the effects of urbanization bias from the network. But that aside, it is important to recognize that a class 4 or 5 WMO rating does not mean a station is “junk.” From the Met Office:
Yet still, regardless of this word salad, you trendology ruler monkeys routinely ignore instrumental uncertainties, whatever the magnitude.
And exactly what does “valid high-quality data” mean anyway?
It means they are just numbers and not measurements! Just like in all the stats classes.
That they are useful in establishing “averages” over a wide area!
When the uncertainty is ±5.0 or even ±3 degrees, you need to explain how an anomaly of 0.001 ±3 (or worse) is not junk!
You didn’t address once how uncertainty in the measurements plays into the issue. Show us how the uncertainty is propagated through all the averaging.
The uncertainty of class 4 and class 5 stations is not a given 2 and 5 degrees, respectively. Rather, the station readings might be biased by up to these amounts, depending on the specific siting conditions of each individual station. Most stations in the UK that are classified as 4 or 5 under the WMO ranking system have substantially lower uncertainties for the reasons laid out above.
This is also systematic bias being discussed, not random measurement error.
In other words, you don’t understand the difference, nor do you understand that uncertainty is not error.
What do you think “might” means in terms of uncertainty?
Unless you have the individual uncertainty budget for each station then you need a much better reason for assuming that each single measurement reading at a given station has a better uncertainty than that.
Didn’t your physical science lab classes in college require you to put down the serial number of each device you used so the chain of your measurements could be verified? Nothing was assumed! All your comments are nothing more than waving your hands while saying, “it doesn’t matter” while calculating to milli-kelvin.
It means that the station may have less uncertainty depending on the specifics of the siting conditions. Which is the case for the UK – many stations receive 4-5 classifications under the WMO system because they don’t get enough sunlight, this doesn’t mean there is anything wrong with the data being recorded. The classification simply indicates how representative a given station can be taken to be of its surrounding environment.
Yet you have no way to know if it is less or not! That is uncertainty. It’s all you can count on.
You have furnished no information to justify assuming anything but the standard values of uncertainty for each classification of a station.
Again, you are just waving your hands saying, “nothing to see here”.
Read this. ANNEX 1.B. SITING CLASSIFICATIONS FOR SURFACE OBSERVING STATIONS ON LAND
You misconstrue the sunlight requirement. That is only one of the criteria for temperature.
Here are the section titles for temperature..
Just to be clear, that is precisely why the WMO classification system can’t be used to assign uncertainty on a station by station basis. It’s just an indicator of how representative a station is of the surrounding area. The WMO and Met Office are explicit in stating that the classifications should not be used in precisely the way you are attempting to.
If the stations are not representative of a wide area then they should not be used to develop an “average” for a wide area either!
Using the anomaly eliminates most of the issue. Applying homogenization to the records cleans up any remaining siting bias.
Bullshit—the usual trendology assumption that the magic of the anomaly cancels error—it can’t.
You might understand this if you anything about metrology, but climate science treats everything as pure numbers without uncertainty.
Your comments have been the usual total waste of space.
Trying to justify the scientifically unjustifiable with gibberish.
The Met Office knows that most of the class 3, 4 and 5 sites will likely be highly biased in the warming direction.
They use that to create agenda-driven false trends.
No amount of mindless prattling from a low-level AGW apologist will change that fact.
Did you not read the document I referenced? Those are estimated uncertainties to be added to the uncertainty budget of the station. Nothing, I repeat, nothing in the document provides any latitude in applying those additional uncertainties. Those values are in the titles of the sections describing the criteria to use. Any latitude is all in your mind, not in the document.
“WMO guidance does, in fact, not preclude use of Class 5 temperature sites – the WMO classification simply informs the data user of the geographical scale of a site’s representativity of the surrounding environment – the smaller the siting class, the higher the representativeness of the measurement for a wide area.”
If the stations are not representative of a wide area then they are *NOT* useful in establishing an average value for a wide area.
Bet you the Met doesn’t exclude these stations when forming “averages”.
It is clear that AlanJ has never been anywhere near any science or engineering in his entire useless life.
Deliberately accepting data that is known to be very likely of low quality and almost certainly tainted towards a biased outcome.
That is what rabid cult propagandists and paid shills do… not scientists.
Nope! It is painfully obvious he’s out in the lake on a canoe without a paddle.
Blah, blah. Another yapping, empty word salad, pertaining to nought.
These sites have changed over time, got gradually worse.
They could easily be adding a couple of degrees to any trend.
So they cannot be used to assess temperature changes over time.
So they are no longer data, just constructs.
Fake Data, like the Fake News, made up out of nothing.
They are data representing change to the climatology. They are no longer data representing the local temperature at the monitoring station.
So yes, they are just constructs as I said. Not data.
Why do you climate catastrophists have to mis-label stuff to make it sound legitimate?
It’s as transparent as “The Emperor’s New Clothes” fairy tale.
I would borrow Wikipedia’s definition of “data” as being “In common usage, data is a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A datum is an individual value in a collection of data.” If you have some alternative, unique-to-you definition under which you are operating, it would be important to share that in public discussions such as this, so that everyone can be aligned.
How about “data” are verified original records whose probity and provenance are inscrutable.
sorry inscrutable is wrong word.
auto fill again.
Unquestioned
You could use whatever definition you like…
… put it into meaningless and irrelevant word-salad gibberish
You are still yapping anti-science BS.
Yep! yap-yap-yap-yap.
That you would use Wikipedia’s definition is yet another indication you don’t know WTF you yap about.
How about measurements? Measurements are what we are dealing with.
Simple data is a collection of numbers. That is why anyone who has learned statistics sees the numbers as nothing more than a number. No uncertainty, no resolution, no accuracy, no precision. I’ll bet you can’t show me one undergraduate statistics class where uncertain numbers are dealt with.
Measurements are not just numbers on a page. They convey information about what was measured, what the value of the measurement is, and provides information about the resolution the device used to make the measurement has.
All this conversation and text from you, and not one word about the uncertainty in the measurements you are dealing with. Not one word about how uncertainty adds. Not one word about how homogenization adds uncertainty. It tells those of us who have dealt with measurements that you do not see nor understand the information they contain, but only that it is a number to be fiddled with.
Why those of us who believe in measurements, and not in a global average, take that view is right here.
You obviously believe in Mann’s hockey stick. Tell us where it is in this graph. This is a substantial area that is covered with CO2. Where is the CO2-caused hockey stick? There are all kinds of localities and regions on the globe that look like this. Why does the global average temperature have a hockey stick when none of these do?
Let’s talk a little about the logic behind anomalies and how they are homogeneous over large distances. How about a 1000 km circle? That means a point on the circumference should have the same value as the point at the center. Now draw another 1000 km circle around that point on the circumference. Now a point on the circumference of the second circle directly opposite from the center of the first circle should be the same as the center of the first circle. Pretty soon, the whole area is covered and all the trends are the same because they have been homogenized. What prevents the homogenization algorithm from ending up doing this?
You don’t really expect an answer to this, do you?
Even anomalies are only useful if you are measuring the *same* thing. Temperatures are never the “same” thing. And by the same thing I mean a 2″x4″x8′ board in Houston vs one in Lincoln, NE.
The variance in diurnal temps between Topeka, at the bottom of the Kansas River valley, and Hays, KS on the central High Plains is *different*. Therefore the anomalies will have a different variance. To combine the two means that some kind of weighting factor must be applied so you are comparing apples and apples instead of apples and oranges. Yet climate science doesn’t do this. They just jam everything together and say “everything cancels”.
Z-score normalisation or even something as simple as range normalisation would seem to be a better approach than subtracting a baseline average to calculate an anomaly.
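For comparison, a small Python sketch (synthetic numbers, one hypothetical station) of the three transforms being weighed up here: a baseline anomaly only re-centres the series, a z-score also rescales by the station’s own variability, and range normalisation maps it onto [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical 30 years of July mean temperatures at one station, degC
july = 22.0 + rng.normal(0.0, 1.5, 30)

baseline = july.mean()                       # stand-in for a 1961-1990 style baseline
anomaly = july - baseline                    # re-centred, units still degC

z_score = (july - july.mean()) / july.std(ddof=1)            # dimensionless, unit variance
range_norm = (july - july.min()) / (july.max() - july.min())  # squeezed into [0, 1]

print(anomaly.std(ddof=1), z_score.std(ddof=1), range_norm.min(), range_norm.max())
```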
The data should be available.
“The data should be available.”
Not for climate science. They don’t ever use variances or range. They just assume everything is iid and Gaussian.
Standardization or normalization would allow comparison of the distributions, but would they allow adding variances? How would you convert the sum of the variances back to real-world values?
Because more do than don’t. Thus, the average global temperature is warming.
That isn’t what correlation is. The further you get from a given point the lower the correlation is expected to be. Homogenization algorithms only consider a station’s nearest neighbors.
Fraud.
Like San Diego and Ramona? Like Pikes Peak and Colorado Springs?
There are numerous stations in and around San Diego, so I’m not sure why you’re speaking of city names as though that is the relevant scale.
Pick one. Pretty simple.
The pairwise homogenization algorithm uses the station’s nearest neighbors (plural).
“There are numerous stations in and around San Diego, so I’m not sure why you’re speaking of city names as though that is the relevant scale”
You just proved my point!
San Diego and Ramona are about 30 miles apart. Yet the temp in San Diego can be 75F while in Ramona it is hot enough to melt the soles of your tennis shoes, 100F or more!
Temperature is determined by a whole host of factors, among them humidity, pressure, wind, terrain, geography, altitude, etc. DISTANCE is not one of the factors.
That’s the reason why temperature is an intensive property and not an extensive one. The temperature here doesn’t determine the temperature there.
San Diego is on the coast; its temperatures are moderated by the ocean. Ramona is inland; its temperature is *not* moderated by the ocean. A homogenization algorithm based on distance using San Diego and Ramona temps will be useless.
Climate science is a joke to an educated physical scientist. It’s like an agricultural scientist considering river bottom ground being the same as rocky clay on the side of a hill as far as being able to grow soybeans.
The thing that is correlated over long distances is not temperature, but the temperature anomaly. This should be intuitively obvious from common experience. If it is a hotter than normal day at my house on the hillside, it is likely an equally hotter than normal day at my buddy’s house in the valley, even though the temperature at both locations is probably very different, the anomaly will likely be quite similar. On a given day this might not be true – prevailing conditions up slope might be influenced by atypical cloudiness – but over time these random variances will tend to even out, and the common trend will stand out.
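Here is a small simulation (Python, invented numbers) of the claim being made in that comment, built on the same assumption it makes, namely that the two sites share a regional weather signal plus independent local noise:

```python
import numpy as np

rng = np.random.default_rng(1)
days = 365
regional = rng.normal(0.0, 3.0, days)                    # shared regional weather swing

hillside = 8.0 + regional + rng.normal(0.0, 1.0, days)   # cool site, own local noise
valley = 14.0 + regional + rng.normal(0.0, 1.0, days)    # warm site, own local noise

anom_hill = hillside - hillside.mean()
anom_valley = valley - valley.mean()

print("typical temperature gap:", np.mean(np.abs(hillside - valley)))      # ~6 degC
print("typical anomaly gap:    ", np.mean(np.abs(anom_hill - anom_valley)))  # ~1 degC
print("anomaly correlation:    ", np.corrcoef(anom_hill, anom_valley)[0, 1])
```

Whether real neighbouring stations actually satisfy that shared-signal assumption is, of course, exactly what the rest of this thread disputes.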
“The thing that is correlated over long distances is not temperature, but the temperature anomaly.”
More bullshite! What are the variances associated with the anomalies? Here in the central US temps in winter can easily have a range of 30F while in summer the range is 20F. DIFFERENT VARIANCES!
The different variances means the anomalies will also have different variances! How do you use statistics to compare anomalies with different variances?
They *should* be compared using weighting, e.g. aV1 and bV2, where a and b are weighting values.
How does climate science determine what a and b are? It’s not obvious that climate science even knows what the variances are for different stations let alone a weighting factor that allows them to be legitimately compared!
BE SPECIFIC! How are a and b determined in climate science?
You just described a skewed distribution. You shouldn’t have that if the sampling was done correctly according to the CLT. Explain how that happens. Explain how you find an accurate SEM.
Lastly, that is a simpleton’s answer. Show a histogram of the distribution that proves your assertion.
The CLT states that the sample means distribution approaches normal, not that any sample should exhibit a normal distribution, nor does it state that the population will exhibit a normal distribution. If the planet is warming on average then more places must be exhibiting warming than are not, otherwise the planet would not be exhibiting warming.
Who cares if “the planet” is warming or not?
The same question arises.
Lastly, that is a simpleton’s answer. Show a histogram of the Global Average ΔT distribution that proves your assertion.
The CLT allows one to calculate an estimated mean and a standard deviation of the sample means regardless of the population statistical parameters. Where:
μ = x̅
σ = SEM • √n
What is σ for a monthly GAΔT?
Note: I am not implying that is a proper value for the total measurement uncertainty, but it is a start.
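A quick simulation (Python, deliberately non-Gaussian, made-up population) of the two relations quoted above: the spread of the sample-means distribution is the SEM, and multiplying it by √n recovers an estimate of the population σ.

```python
import numpy as np

rng = np.random.default_rng(7)
population = rng.gamma(shape=2.0, scale=5.0, size=100_000)   # skewed, not Gaussian

n = 30
sample_means = np.array([rng.choice(population, n).mean() for _ in range(10_000)])

sem_observed = sample_means.std(ddof=1)        # spread of the sample means
sigma_backed_out = sem_observed * np.sqrt(n)   # sigma = SEM * sqrt(n), as quoted

print(sigma_backed_out, population.std(ddof=1))   # the two agree closely
```

None of this, by itself, says anything about the measurement uncertainty attached to each individual reading, which is the caveat in the note above.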
A station at an airport, especially in the early days of aviation, needed to know the local weather conditions. They still do for takeoffs and landings. But those readings, especially now in “the jet age”, have become less and less reliable for any kind of “climatology” input.
Same for other sites. There was no and never has been a “global” network of sites. Even now.
Please define just how far apart from each other a network of sites would need to be to count as an actual “global network”?
(And don’t forget the oceans.)
You only need about 60-90 surface stations to obtain a robust estimate of global surface temperature change:
https://andthentheresphysics.wordpress.com/2018/08/18/you-only-need-about-60-surface-stations/
But I agree that the GHCN is not a true climate observing network, it is a cobbled-together amalgamation of observing stations intended for discrete local weather monitoring. But that is the network that exists, from which our historical data can be obtained, so scientists have to work with it. That’s why they have to apply the sets of adjustments that all of you are so against.
“But that is the network that exists, from which our historical data can be obtained, so scientists have to work with it.”
“work with it” means using reasonable estimation of the uncertainty of the results. The assumption that all measurement uncertainty is random, Gaussian, and cancels is *NOT* a reasonable estimation of the uncertainty of the results.
No such assumption is made, so we are safe.
Bullshite! It’s the main meme of climate science. You’ve even invoked it yourself by saying that measurement uncertainty cancels when considering thousands of measuring stations.
It’s so ingrained in climate science that no one even recognizes that they are making the assumption!
Only the random component of error tends to cancel when considering the mean of large sample sizes. The non-random component is not expected to cancel. The difference between accuracy and precision and all that.
And you still don’t understand that uncertainty is not error.
But I’m certain you’ll keep blathering away regardless.
This fellow thinks statistical Standard Error is measurement error. That is, the more samples the smaller the Standard Error. God forbid you deal with populations and not samples. He probably believes measurements are exact numbers. I mean that is all they deal with in statistics.
You are exactly right; plus, he elevates the mighty temperature anomaly to a high plateau as the be-all-end-all of climate while ignoring how the baseline subtraction is completely arbitrary.
This is *NOT* an issue of the difference between accuracy and precision. You have no idea of what you are saying, do you?
This is just a thing that you have made up. It’s fine that you believe it, but nobody else does. Repeating it does not bolster your case. And you can see that it is not true by the fact that estimates of the global mean converge as the sample size grows. So there is no question that a larger sample size improves the precision of the estimate.
Don’t make assertions when you are ignorant of a subject. In this case, metrology. I’ll emphasize that is not meteorology. You just appear absolutely stupid.
Maybe you should breakdown NIST TN 1900 and tell everyone how the precision of the estimate is incorrect.
In this post you have really exposed that you are a statistician first and last. Tell us what metrology studies you have done. I know you’ve never had any upper-level physical science lab class where measurements are of uppermost importance. Those are where you learn to do science.
“This is just a thing that you have made up”
It’s basic metrology!
Sample size ONLY affects sampling uncertainty. It does *NOT* affect precision. Precision of the mean is determined by the precision of the least precise data element used to determine the mean.
You are simply trying to claim that if you take enough measurements of the heights of a group of Shetland ponies and a group of Arabian horses, that you will get a Gaussian distribution.
YOU WON’T GET A GAUSSIAN DISTRIBUTION. And the average won’t be a “central tendency” of anything!
Temperature measurements of different things using different devices are exactly the same situation. Thinking you will get a Gaussian distribution if you take enough measurements is a joke. And it doesn’t matter if you are using absolute temps or anomalies.
This kind of idiocy is based on the assumption that the distribution of temps around the earth is Gaussian and that if you take enough measurements you’ll duplicate that Gaussian distribution. The problem is that the assumption that the distribution of temps around the earth is Gaussian is wrong. Just the fact that you have ocean and land temps in the mix militates against getting a Gaussian distribution.
This is not remotely what I’m claiming. I’m claiming that if for some reason you want to estimate the average height of Shetland ponies and Arabian horses, increasing your sample size will yield a more precise estimate of the mean. And in fact the distribution of the sample means will approach normal as the sample size grows. The CLT does not depend on the underlying distribution being gaussian.
It’s a fascinating microcosm of delusion to watch you Gorman twins founder in the cult of confusion you have single-handedly created. Someone could do a very interesting sociological study on the two of you.
Clown. Learn some metrology PDQ and stop embarrassing yourself.
“And in fact the distribution of the sample means will approach normal as the sample size grows. The CLT does not depend on the underlying distribution being gaussian.”
So what? This entire post is nothing more than the “Numbers is Numbers” meme of statisticians totally divorced from the real world!
No matter how precisely you locate the mean of the heights of Shetlands and Arabians it will tell you exactly nothing about the reality of Shetlands and Arabians!
It’s EXACTLY the same when jamming summer and winter temps together. Even the anomalies will have different variances, making the comparison useless in the real world!
The only cult of confusion is climate science whose main memes are:
7– Adjustments, homogenization, and anomaly baselines have zero uncertainty
That is actually correct, by necessity.
The offset must be regarded as a constant to allow lossless conversion between scales (either anomaly, C and K, or anomaly, F and Ra)
The period over which the baseline was calculated retains its own uncertainty, as does any other period which is re-zeroed.
“That is actually correct, by necessity.”
No, it isn’t. If the baseline average is an average of MEASUREMENTS, then the baseline temp is a “stated value +/- measurement uncertainty”.
You do not have to consider the stated value to be a constant with no uncertainty. You can convert the measurement uncertainty interval just as easily as you can convert the stated value.
A baseline of 10C +/- 1C converts to 50F +/- 1.8F
It converts to 283.15K +/- 1K
The *relative* uncertainties change as you convert scales but there is no requirement that the relative uncertainties stay constant that I know of.
That’s the point. Both the C/K and C/F conversions use constants, so are reversible.
and 50F +/- 1.8F converts to 10C +/- 1C, and 283.15K +/- 1K converts to 10C +/- 1C. BTW, I’m intentionally avoiding the spurious precision introduced by the conversions.
The C/K conversion has a constant offset of 273.15K, and the C/F conversion has a constant multiplier of 1.8 and a constant offset of 32F
Assuming an arbitrary anomaly offset of 9.8C, 10C +/- 1C converts to 0.2A (Anomaly, for want of a better abbreviation) +/- 1A, and 0.2A +/- 1A converts back to 10C +/- 1C.
If the offset isn’t treated as a constant, the uncertainties keep adding.
To determine that 9.8C offset, you may have had a baseline average of 9.8C +/- 0.5C, or 9.8C +/- 0.005C. Converting the baseline period from Celsius to Anomaly, you wind up with 0A +/- 0.5A or 0A +/- 0.005A respectively.
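A minimal sketch (Python, using the example numbers above) of the reversible conversions being described, with every offset treated as an exact constant; whether the baseline offset really carries no uncertainty of its own is the point being argued in the surrounding comments.

```python
def c_to_f(value_c, u_c):
    # the multiplier 1.8 scales the uncertainty; the +32 offset is an exact constant
    return value_c * 1.8 + 32.0, u_c * 1.8

def c_to_k(value_c, u_c):
    return value_c + 273.15, u_c            # constant offset only, uncertainty unchanged

def c_to_anomaly(value_c, u_c, baseline_c=9.8):
    # baseline offset treated as exact, exactly as argued above
    return value_c - baseline_c, u_c

print(c_to_f(10.0, 1.0))        # (50.0, 1.8)
print(c_to_k(10.0, 1.0))        # (283.15, 1.0)
print(c_to_anomaly(10.0, 1.0))  # (~0.2, 1.0) and adding 9.8 back recovers 10.0 +/- 1.0
```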
If you follow NIST TN 1900, each monthly average has a measurement uncertainty. If you declare the measurand as a random variable with 30 entries, one can calculate the uncertainty using relative uncertainties.
The pertinent equation is: (assuming ±1.8 for each)
u(b_line) / μ(b_line) = √[(1.8/μ_yr1)² + … + (1.8/μ_yr30)²]
When calculating the anomaly, the monthly mean has the μ(b_line) subtracted. That gives the form of:
μ(X – Y) = [μ(monthly_avg) – μ(b_line)]
where the variance is:
σ²(X – Y) = σ²((monthly_avg)) + σ²(b_line)
The end result is that each anomaly inherits a rather large measurement uncertainty which should limit the number of significant digits being used.
If one wants to do statistics, one must do the follow through appropriately. The other option is complaining that NIST doesn’t know what they are doing!
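For concreteness, a short sketch (Python, hypothetical numbers) of the subtraction step. It uses the standard GUM-type rules for an average of independent values and for a difference, rather than the relative-uncertainty route written out above, but the punchline is the same: the anomaly inherits at least the full monthly uncertainty, which argues against quoting it to many decimal places.

```python
import math

u_single = 1.8          # assumed standard uncertainty of each yearly monthly-mean value
n_years = 30

# baseline = mean of 30 yearly values; if they were independent, its uncertainty shrinks
u_baseline = u_single / math.sqrt(n_years)              # about 0.33

# the monthly average being anomalised, taken with the same 1.8 for illustration
u_month = 1.8

# anomaly = monthly mean - baseline mean, so the variances add
u_anomaly = math.sqrt(u_month**2 + u_baseline**2)       # about 1.83

print(round(u_baseline, 2), round(u_anomaly, 2))
```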
We seem to be talking past each other again. Certainly the temperature, expressed in anomaly degrees, has uncertainty, as does the temperature expressed in K, C, F or Ra..
My only point is that converting between Celsius, Kelvin or “anomaly” needs to be reversible, and this requires the offset to be regarded as a constant.
That’s something I’ve been pondering lately. I’m still trying to work out how the average temperatures can be expressed to more significant digits than the measurement resolution of the instruments used to take the measurements.
The thing to remember about uncertainty is that it defines an interval surrounding a measured value. That interval can be determined through a statistical analysis or by knowledge of predefined values. The GUM defines all of these as “standard deviations”, which allows similar treatment, such as the addition of various categories of uncertainty. As long as the distributions defined by the measured value and its (expanded) standard deviation are comparable after conversion, there should be no problem. If that makes them “constant”, then yes, they are constant.
They cannot. Too many statisticians take the SEM value as an indicator of how many decimal places the mean may have. That is not what the SEM does. It is the standard deviation of the sample means distribution. It defines an interval where 68% of the values the mean may take exists. It does not define the value of the mean.
We might be on the same page here. All I’m trying to get across is that re-zeroing to convert from one baseline system to another (K/C/”A”) requires the offset to be treated as a constant (eg 273.15K) to allow reversible transformations.
It appears that something similar is being done with the measurement uncertainties (resolution limit + the other factors), by adding in quadrature, dividing by the number of readings, then taking the square root.
That doesn’t seem correct, because the resolution limit should be irreducible without higher resolution instruments.
Metrology isn’t my field, so this may be barking up the wrong grove, let alone the wrong tree.
The analogy I was toying with is measuring a hypothetical ream (500 sheets) of 90gsm printer paper (nominal 112 micron) with a 0.01mm (10 micron) micrometer.
The ream straight out of the packet measures 56.00 mm +/- 0.005mm. Average sheet thickness is 0.112mm (I think, +/- 0.0001mm – probably wrong)
After fanning the paper, the ream measures 56.12mm +/- 0.005mm (air gaps between pages). Average sheet thickness is now 0.11224mm +/- 0.00001mm.
These are both legitimate, because we just scaled from the direct measurement of the ream.
However, measuring each individual sheet gives 0.11mm +/- 0.005mm. Adding the sheets together gives 55mm +/- 2.5mm. The ream measurement is within this range.
Adding the uncertainties in quadrature gives sqrt(0.0125 mm²), or about 0.112mm. The actual ream measurement lies well outside 55mm +/- 0.112mm.
Did I muck up adding in quadrature? sqrt (0.005 ^2 * 500)
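Answering the arithmetic question first (Python, numbers exactly as given in the thought experiment): direct addition reproduces the ±2.5 mm figure, quadrature reproduces the ≈0.112 mm figure, and scaling the single whole-ream reading down gives 0.00001 mm per sheet.

```python
import math

n_sheets = 500
u_sheet = 0.005        # mm, resolution-limited uncertainty of one per-sheet reading
u_ream = 0.005         # mm, uncertainty of the single whole-ream reading

u_direct = n_sheets * u_sheet                    # 2.5 mm (no cancellation assumed)
u_quadrature = math.sqrt(n_sheets) * u_sheet     # sqrt(500) * 0.005 ~= 0.112 mm
u_scaled_per_sheet = u_ream / n_sheets           # 0.00001 mm per sheet

print(u_direct, round(u_quadrature, 3), u_scaled_per_sheet)
```

So the quadrature sum itself was done correctly; whether quadrature is even the right rule for this case is taken up in the replies below.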
“It appears that something similar is being done with the measurement uncertainties (resolution limit + the other factors), by adding in quadrature, dividing by the number of readings, then taking the square root.
That doesn’t seem correct, because the resolution limit should be irreducible without higher resolution instruments.”
You still use the significant figure rules with the measurement uncertainties.
Wouldn’t that be better expressed as:
“You should still use the significant figure rules with the measurement uncertainties.”?
No “should”. “Should” implies that not doing it still gives a usable result. It doesn’t.
“Should” just means it’s supposed to be done, but it’s on the cards that it isn’t being done.
“Did I muck up adding in quadrature?”
The first thing you have to do is determine if you are going to add directly or in quadrature. Since you have assumed all the sheets are identical, you probably want to add the uncertainties directly. Doing quadrature assumes you are getting some partial cancellation of random uncertainties across multiple measurements. In your example that doesn’t happen.
Okay. Thanks.
It seems a reasonable assumption in this case.
The underlying rationale of temperature anomalisation and homogenisation seems to be to make all sites “equal”. Perhaps some are more equal than others.
Quite so. What circumstances would warrant adding resolution uncertainties in quadrature?
In this case, I was thinking of the resolution uncertainty of the measuring instrument, rather than homogeneity of the measurands.
If I went to a 0.001mm micrometer there would likely be some variation in readings, but the discrepancy between ream thickness and the sum of sheet thicknesses would almost disappear.
See the last sentence in the picture from Dr. Taylor. “Of course, the sheets must be known to be equally thick.” If this requirement is not met, the result is basically meaningless in terms of what each sheet measures.
One example is finding the area of a rectangle. If the length is on the low side and the width on the high side (or vice versa), you may get some cancellation. Another is putting two 2×4s end to end. One may be short and the other long, whereby the total has less variance and the uncertainties will cancel.
The real question is where the uncertainties come from. Is each measurement of the short side done multiple times under repeatable conditions? Same for the long side, with multiple measurements under repeatable conditions? That would be an excellent use of the standard error of the mean for each uncertainty.
The problem is that the next rectangle may have different measurements and uncertainties. If you simply used what you had on the first set, you could be terribly inexact. This can occur for however many you create. This is where single measurement theory comes into play. If you are looking at a number of sets and calculate their average area, and use the standard uncertainty of the mean added in quadrature, you won’t be telling a customer what the actual variance between each rectangle is. You should use the standard deviation in this case.
You are doing this just like Dr. Taylor does in his book.
An Introduction to Error Analysis, The Study of Uncertainties in Physical Measurements
Now let’s discuss what is going on here. By dividing by 500 you are finding the width of an average sheet. By dividing the uncertainty by 500, you are creating an average uncertainty for the average width. The huge assumption behind this is that every piece is EXACTLY the same. To find the uncertainty of the whole, simply multiply by 500.
Tim and I have warned about this quite often. In most cases, one is not concerned about the average but about the variance of the distribution of all 500 pieces. To do this adequately, the average uncertainty is a good starting point for determining the capability of a measuring device. To detect the average uncertainty one would need a device capable of measuring to 0.00001 mm.
Why is the standard deviation of the 500 needed? Because you may have a copier that jams when sheets are too small or too large. To judge how many sheets might jam, you need to know the standard deviation and not the average uncertainty.
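A quick sketch of that point with invented numbers (0.110 mm nominal thickness, 0.004 mm spread, and a made-up copier tolerance):

import random, statistics
random.seed(1)
sheets = [random.gauss(0.110, 0.004) for _ in range(500)]   # 500 invented sheet thicknesses, mm
sd = statistics.stdev(sheets)
sem = sd / len(sheets) ** 0.5                               # ~0.0002 mm: says nothing about jamming
jams = sum(1 for t in sheets if not 0.100 <= t <= 0.120)    # hypothetical copier tolerance
print(round(sd, 4), round(sem, 5), jams, "sheets outside tolerance")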
What is happening is a fundamental assumption in measurements. Repeatable measurements OF THE SAME THING let one use the standard deviation of the mean to gauge the uncertainty of THAT ONE THING. See B.2.15 of the GUM for appropriate conditions.
When you make single measurements under reproducible conditions (see B.2.16) the standard deviation is the only viable measurement uncertainty.
I’m in rather exalted company there 🙂
That may have even been the germ of the idea.
I've stipulated above that each sheet was measured at 0.11mm +/- 0.005mm, so the variance is 0. As a bonus, we have measured the entire population, so we have the population mean.
True, the (reasonable) assumption of opening and measuring a sealed ream of printer paper is that the sheet thicknesses are as specified.
The next step of the thought experiment is to add in reams of paper of different weights to introduce sampling error.
Then to subtract the nominal thicknesses to determine “anomalies”.
After that, to add in reams from other manufacturers.
Mixing and matching metric paper and imperial paper would add some spice, as would duplicating the measurements with both metric and imperial micrometers of comparable resolution. Possibly even add in the use of Yum Cha micrometers of dubious accuracy, and miscalibrated micrometers.
It’s still rather nebulous, so all suggestions are welcome.
That includes “the mathematicians”, although they all seem to have left the building.
Most mathematicians/statisticians are only trained on numbers that have no uncertainty. Every number is ±0. Most have never taken a junior/senior lab class in any physical science so that they learn about resolution, measurements, and uncertainty. Most have never had a job repairing equipment where tolerance in the thousandths or ten thousandths of an inch make all the difference in a product that works as expected or is a runt. They have no idea how hard it is to make those measurements nor how bad they can go.
That is quite reasonable. The object of the exercise is to teach the theoretical basis of the equations and where they are applicable. Measurement uncertainty is a confounding factor, and belongs in a higher level course once there is a solid grasp of the foundational elements.
As an aside, almost all statistical work is done on counts rather than measurements. Uncertainties tend to come from sampling.
Physical sciences certainly should build on the foundational statistical aspects by adding measurement uncertainty. Given the example of a practicing astronomer who thinks a sample size of 60 is sufficient, that may not always be the case 🙁
One nice thing about older machinery is that tolerances and wear limits aren’t that tight. You can get away with feeler gauges and 0.001″ micrometers. Calipers and Plastigage might be pushing it, though.
Find an upper level stats book that deals with uncertain numbers – you won’t. Unless you major in metrology it remains ±0.
That is more complicated than most realize. That is where statisticians fall down. Uncertainty in measurements doesn’t arise from sampling error. Uncertainty in measurements arises from variations in measurements and not from sampling a population incorrectly or imperfectly.
Those variations in measurements can be analyzed AND characterized using statistical methods. That allows standardized characterization that everyone understands, hence the GUM.
If one measures the SAME THING many times (repeatable conditions) AND the resulting distribution of measurements is Gaussian, then one can use the mean and standard deviation of the mean as the characterization of that method of measurement and measurand. If the distribution is uniform, triangular, or worse, skewed, then other statistical parameters must be developed, hence the GUM.
If one measures multiple non-repeatable measurands (temperatures) one time each, then each SINGLE measurement has no distribution to analyze statistically. That doesn't mean there isn't an uncertainty one can use for each measurement. You can use a Type B value if known; for instance, NOAA says ASOS has an uncertainty of ±1.8°F. In this case, however, the spread or range is the important uncertainty in a mean of the values, and the standard deviation can characterize the expected range of values (if the distribution is Gaussian). This type of uncertainty arises under reproducibility conditions, which are different from repeatable conditions.
I forgot to add that if you dig into ISO requirements for an uncertainty budget, you will find separate items must be included for repeatability conditions and reproducibility conditions.
NIST also covers much of this in their Engineering Statistical Handbook.
Whether we like it or not, metrology is a (widely used) specialised field of Applied Statistics. Econometrics is another specialised field.
Couldn’t say it better myself. The statisticians amongst us should realize that metrology is a legitimate specialized field of study and is used all over the world in measuring physical characteristics.
The statisticians do. The mathematicians; not so much.
I must have worded that unclearly, sorry.
That was referring to statistical uncertainty where counts are involved rather than measurements. It's one of those "build a robust foundation first" things. Get an understanding of the field using exact numbers first, before adding the complexities of fuzzy numbers.
Hey, no problem. I just took it as an opportunity to discuss more about measurements and their analysis. Too many here remember their statistics classes from high school and college where a large amount of time was spent with sampling.
I've even had people try to tell me that single temperature measurements are a sample of the global temperature population, and that if you have 10,000 stations you can divide by √10,000 to get the Standard Error. They neglect to recognize that a single temperature reading is a sample SIZE of 1. Others tell me, no, you can treat it as a single sample of size 10,000. But then the CLT doesn't apply, because you need enough samples to create a sample means distribution. I wish that were called something else, because it leads some to think the mean of a single sample is a sample means distribution. That isn't so.
The real problem is that folks need to understand taking measurements (how they can vary, how devices have resolution, what non-repeatable, reproducible single measurements are, what repeatable measurements are, etc.) before they ever touch the statistics side. Junior and senior physical science lab classes, where a detailed lab book is required showing IDs for each measuring device used, detailed records of readings, experimental setup, etc., and must be completed to pass the course, or the equivalent on-the-job experience, are the basics.
You will enjoy this in that case.
See what they did here? They threw away all measurement uncertainty that should be propagated into the anomaly values.
Typical statistician operating procedure. The values are 100% accurate and the only error is the sampling error. What a joke!
“Station uncertainty encompasses the systematic and random uncertainties that occur in the record of a single station and include measurement uncertainties, transcription errors, and uncertainties introduced by station record adjustments and missed adjustments in postprocessing. The random uncertainties can be significant for a single station but comprise a very small amount of the global LSAT uncertainty to the extent that they are independent and randomly distributed. Their impact is reduced when looking at the average of thousands of stations.” (bolding mine, tpg)
“The major source of station uncertainty is due to systematic, artificial changes in the mean of station time series due to changes in observational methodologies.”
“The homogenization process is a difficult, but necessary statistical problem that corrects for important issues albeit with significant uncertainty for both global and local temperature estimates.”
Random uncertainties only totally cancel if you are measuring the same thing multiple times using the same instruments under the same conditions. Otherwise they add in quadrature (due to partial cancellation). Temperatures are single measurements of different things using different instruments. Therefore the random uncertainties do not reduce, they GROW.
Artificial changes due to observational methodologies certainly add to measurement uncertainty but they are *NOT* the only components of systematic bias in readings. Calibration drift and microclimate changes are big factors as well. And this paper doesn't address these at all. This paper came out in 2019 and totally ignored the results that Hubbard and Lin published in 2002, where they concluded that you cannot adjust temperatures on a regional or subset basis because of the impact microclimates have on readings. Adjustments have to be done on a station-by-station basis; homogenization will *NOT* work to reduce uncertainty due to calibration drift and/or microclimate changes. Even then you can't just willy-nilly go back and change all historical data, because of the gradients associated with calibration drift and microclimate changes over time. You might actually be making past readings *more* inaccurate instead of more accurate.
This paper really doesn't address the formulation of an uncertainty budget at all. It simply propagates the climate science meme that all measurement uncertainty is random, Gaussian, and cancels – no matter what the measurements represent or the collection methodologies used to generate the data. It makes things simpler but it also gives a false impression of just how much knowledge can actually be gleaned from the data. Getting differences in milli-kelvins from data recorded in the units digit with uncertainties in the tenths digit is impossible – unless you are a climate scientist.
I thought you chaps would enjoy it 🙂
Of the authors:
Nathan Lenssen is a statistician.
Gavin Schmidt is a mathematician.
James Hansen is a physicist.
Matthew Menne is a meteorologist/climate scientist.
Avraham Persin is difficult to find a field for.
Reto Reudi is a mathematician.
Daniel Zyss is also elusive.
Hansen, at a minimum, should understand measurement uncertainty and propagation of error. This is why I wonder if it is valid to use statistical measures to “increase” resolution beyond the capability of the instruments.
It is not valid. Find university physical lab instructions that are online, 99% tell you that you can’t report a finding beyond what you measured. The other 1% don’t mention it. Textbooks providing information about measurements say the same thing. Significant digits are also a part of the bible of handling measurements. How do you make measurements of temperature with two significant digits and end up with three in the measurement and 4 in the uncertainty?
I’ve asked many folks here to provide a reference that says you can do that. I’ve never had a response.
That’s what my background says as well, but that’s only at undergrad level. It’s always in the back of my mind that there might be something at postgrad level which provides an exception to the rule.
I will argue until I’m blue in the face that it’s valid to report averages and s.d.s to more sig figs than the underlying data. The resolution uncertainty seems to be irreducible, though.
For example, is the average of {1.0, 1.0, 1.0, 1.0, 1.0, 1.1, 1.1, 1.1, 1.1, 1.1} 1.05 +/- 0.05 or 1.1 +/- 0.05?
“I will argue until I’m blue in the face that it’s valid to report averages and s.d.s to more sig figs than the underlying data.”
Nope. The average can’t be known past what you *know*. Same with the standard deviation.
Giving a stated value, e.g. the average, to a magnitude beyond the associated uncertainty means you *know* something that is part of the GREAT UNKNOWN. It implies you are more certain of the stated value than the uncertainty you give in the answer provides for.
A given average value of 9357.6 m/s +/- 30 m/s doesn’t make measurement sense. The +/- 30 m/s implies that the tens digit in the answer can vary between 2 and 8. The trailing 7.6 simply can’t be known based on the uncertainty. The 7.6 is just plain insignificant when it comes to the average value. What the units digit and the tenths digit are is unknowable in the face of the measurement uncertainty.
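To put that rule in code form, a little helper (my own sketch; it rounds the stated value to the decade of the uncertainty, whereas some prefer to truncate, as discussed below):

import math
def report(value, uncertainty):
    # keep the stated value only to the decimal place set by the uncertainty's leading digit
    decade = math.floor(math.log10(abs(uncertainty)))   # +/-30 -> decade 1, i.e. the tens place
    return round(value, -decade), round(uncertainty, -decade)
print(report(9357.6, 30))   # (9360.0, 30): nothing past the tens digit survives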
This is why climate science is so adamant about their meme of all measurement uncertainty being random, Gaussian, and cancels. Then the average can go out as many digits as they need it to in order to justify their grant money!
That implies that the mean means something. All it is is the ratio of sum over count.
It’s valid to add as many sig figs to the mean (and s.d.) as the approximate order of magnitude of the denominator. It is still fuzzy, though.
The measurements and the statistical descriptors are telling you different things.
Rounding to the 10s place would give a range of 9360 +/- 30, so 9330 to 9390.
You’re looking at it wrong.
The trailing 7.6 adds 2 sig figs, so it tells you that the sample size is on the order of 10^2.
The +/- 30 tells you the uncertainty.
Rounding the mean and s.d. discards information.
“It’s valid to add as many sig figs to the mean (and s.d.) as the approximate order of magnitude of the denominator. It is still fuzzy, though.”
Not for measurements. You are assuming the “numbers is numbers” meme of climate science.
The significant figure rule is based on the numerator and its “stated value +/- measurement uncertainty”. None of the stated values in the numerator should be given any more decimal places than the uncertainty allows. When the stated values are summed the sum should have no more decimal places than the uncertainty allows.
“Rounding to the 10s place would give a range of 9360 +/- 30, so 9330 to 9390.”
The stated value is the best estimate you can make of the measurand’s property. You truncate the estimate at the place the uncertainty provides. The uncertainty overwhelms any digits after it. You don’t *know* the 7.6 value so you can’t use it to round off the stated value. What your measurement should be is the 9350 +/- 30.
Again, this is the same error climate science makes. They assume they know more than they can possibly know, given the uncertainty of the measurements.
“You’re looking at it wrong.”
No, I am looking at it as an engineer making measurements. I don’t know what I can’t know.
“The trailing 7.6 adds 2 sig figs, so it tells you that the sample size is on the order of 10^2.”
Again, you can *NOT* know the 7.6 since it is past the uncertainty. You simply don’t know where in the interval of +/- 30 the actual value might lie so you can’t “round” anything. It’s part of the GREAT UNKNOWN.
The uncertainty of the average of the measurements is *not* based on the sample size. The sample size determines the interval in which the mean might lie based on the sampling uncertainty but it is based solely on the stated value of the measurements. The propagated uncertainty of the individual measurements *adds* to the sampling uncertainty.
The sampling uncertainty is actually based on the standard deviation of the sample meanS (plural). The data in each of those samples should actually be given as “stated value +/- measurement uncertainty”. So the mean of each sample should be given as “stated value +/- propagated measurement uncertainty”. The various means of the samples then form a data set of “stated values +/- measurement uncertainty”. The mean of those sample means then become a value of “stated value +/- propagated measurement uncertainty of the sample means”.
If you have ONE sample then you have to ASSume that the standard deviation of the sample is the standard deviation of the population in order to come up with the sampling uncertainty. That’s a BIG DAMN ASSUMPTION to make. You wind up with the SEM = (assumed standard deviation)/sqrt(n). Again, climate science just ASSumes that the standard deviation of their sample *is* the standard deviation of the entire global temperature population – without ever providing any justification for that.
Since the variance of cold temps is greater than the variance of warm temps, i.e. one hemisphere vs the other, there is no guarantee that combining NH and SH temperature samples will be Gaussian and, therefore, the assumption that the sample standard deviation is the same as the global temp population becomes questionable.
I'm about to run out of space. Put the climate science memes out of your mind – they simply violate all the metrology and significant figure rules of physical science.
Yep, the numerator has the same number of decimal places as the individual measurements which are summed.
The denominator is the order of magnitude of the number of elements in the set.
Your “9357.6 m/s +/- 30 m/s ” is a trick question, and you caught me not paying attention. Both the numerator and denominator are measurements, so both have uncertainties.
If you change that to “average weight of 9357.6 lbs +/- 30 lbs”, the denominator is a count, hence it has no uncertainty.
In its cleanest form, that gives 100 measurements with a total weight of 935,760 lbs +/- 3,000 lbs.
The mean is (935,760 +/- 3,000) / 100.
The mean may not mean anything much, but it is what it is and that’s all what it is.
The greater concern is that you can have a total of 935,760 +/- 3,000. Are we weighing elephants and Shetland ponies, or prime movers and bicycles?
let u(x) be the uncertainty of x.
If q_avg = SUM/n
then
u(q_avg) = u(SUM) + u(n)
Since “n” is a constant u(n) = 0
u(q_avg) = u(SUM)
so you wind up with q_avg +/- u(q_avg) ==> q_avg +/- u(SUM)
Bellman, alanj, bdgwx, et al have a problem understanding this. They want to say that u(avg) = u(SUM)/sqrt(n). That's not the MEASUREMENT uncertainty of q_avg, it is the SAMPLING uncertainty of q_avg. The larger the sample is, i.e. as "n" grows, the sampling uncertainty gets less. But that is *NOT* how measurement uncertainty is propagated.
Now u(SUM) can either be a direct addition of all the uncertainties (worst case) or a quadrature addition of all of the uncertainties (best case). But it is still u(SUM). The average uncertainty might be u(SUM)/n but what does that buy you? If the elements are all different values with different measurement uncertainties you can’t discern much about what you have by using the average length and/or the average measurement uncertainty.
Think of it this way. avg and u(avg) are nothing more than fictional numbers that, when multiplied by the number of elements, gives the same total as adding up the different individual element stated values and uncertainties.
You have a pile of i-beam scraps left over from a construction site. You want to know how many linear feet you have so you can calculate salvage value. You can measure each one and get a total of “feet +/- uncertainty”. You can then calculate the average length of i-beam you have and what the average measurement uncertainty is. What does that average length and average measurement uncertainty give you? When you take the metal to the scrap yard they are going to want to know the total length and the total measurement uncertainty. You’ll want paid based on the + uncertainty and the scrap yard will want to pay based on the – uncertainty!
If you are going to use the scrap pieces at another construction site, you need to know the standard deviation of what you have, to know how many will work and how many won't. The average won't tell you anything by itself, especially if the variance of the i-beam lengths is large.
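A sketch of that i-beam bookkeeping, with invented lengths and an assumed per-piece uncertainty of ±0.5 ft:

import math, statistics
lengths = [12.0, 7.5, 22.0, 9.0, 15.5, 4.0, 18.0, 11.0]   # invented scrap lengths, feet
u_each = 0.5
total = sum(lengths)
u_total_direct = u_each * len(lengths)              # worst case: 4.0 ft
u_total_quad = math.sqrt(len(lengths)) * u_each     # best case (partial cancellation): ~1.4 ft
print(total, u_total_direct, round(u_total_quad, 2))        # what the scrap yard cares about
print(round(total / len(lengths), 2), round(statistics.stdev(lengths), 2))  # the average hides a ~6 ft spread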
There seems to be some confusion that finding the properties of an individual element in a set of elements that are exactly the same somehow REDUCES the measurement uncertainty. It's not REDUCING anything. It's merely finding the individual element properties by first finding the total properties of the set. But the important piece is that all the elements have to be exactly the same. If they aren't exactly the same, then finding the total properties doesn't help you find the properties of any individual element. And that is especially true if the property is an intensive one, like temperature.
Shouldn’t the relative uncertainty of the average be the same as the relative uncertainty of the sum?
Let’s discuss the two main types of measurements, repeatable and non-repeatable (single measurements).
Repeatable measurements and repeatable conditions mean you are measuring the EXACT SAME THING, numerous times, under repeatable conditions: the same procedure, the same device, the same operator, the same temperature, the same humidity, the same everything, in a short time before things can change. It is one reason for having an environmental chamber that provides repeatable conditions for measuring very, very small quantities. An example is weighing the same gauge block once, removing it from the scale, recalibrating the scale (following the same procedure), placing the same gauge block on the scale with the same force, etc., and repeating 100 times.
When you do that, you can plot your points to ensure the distribution is normal. Then you can assume that same piece has an uncertainty that is equivalent to the standard deviation of the mean (divide σ/√100). That will be the uncertainty of the average of all the measurements. BUT, it only applies to that ONE PIECE you have measured multiple times.
Now, how about single measurements? A 100 m race, a drop from a pipette, filling a beaker to a graduation, gathering scrap 2×4 pieces from a construction site, a group of gauge blocks. Or, in our case, temperatures at different times or days. These are all measurements of things that change from one measurement to another. They are called single measurements. When you group them together into a random variable such as a monthly average temperature, the mean (average) is not the center value of a distribution surrounding one measurand. The mean is only the "middle" value of the group of dissimilar measurands.
We’ve used this example before. You are a quality manager of a process of making pins that meet a specific length requirement. You sample every piece and measure it once. You gather 10,000 in a day. You have a mean length of 6 inches and a standard deviation of 0.2 inches. You figure the standard deviation of the mean is the proper choice here so you divide 0.2/√10000 = 0.002 inches and tell your salesmen that they can tell customers your pins are 6 inches ± 0.002 inches. Are you telling them the true variation in the pins?
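A sketch of that pin scenario in code (the normal spread and the spec limits are invented purely for illustration):

import random, statistics
random.seed(2)
pins = [random.gauss(6.0, 0.2) for _ in range(10_000)]   # one measurement per pin, inches
sd = statistics.stdev(pins)                  # ~0.2: the variation a customer will actually see
sem = sd / len(pins) ** 0.5                  # ~0.002: only how well the mean itself is pinned down
out = sum(1 for p in pins if not 5.9 <= p <= 6.1) / len(pins)   # hypothetical spec of 6 +/- 0.1
print(round(sd, 3), round(sem, 4), f"{out:.0%} of pins outside spec")   # roughly 60% outside with these numbers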
Here is what the GUM says.
Dispersion of values. Does the standard deviation of the mean really tell one what the dispersion of the values of single measurements of non-repeatable measurands truly is?
You've nailed it. The mean doesn't really mean anything beyond being a measure of centrality. You really want the 3 Ms, the variance (or standard deviation) and the sample size to get much of an idea.
Not exactly. If we are going to get into the weeds there needs to be a whole lot more known.
Let's assume the measurements were taken under reproducible conditions. IOW, this is a group of independent measurements of different but similar things. How about the speed of a satellite? The speed has been measured to the nearest tens.
We now have a random variable, Speed, with data. Let's say there are 100 measurements.
The measurement uncertainty of reproducible measurements is the standard deviation of the random variable. The mean of the random variable is the stated value.
This would make μ = 9357.6 and σ = 30.
If the data is read to the nearest tens digit, we cannot quote the mean any more accurately than that. So μ ≈ 9360 and σ = ±30.
From An Introduction to Error Analysis, Dr. John R Taylor.
I’ll give you your rounding as long as you provide at least the reading resolutions, total, and sample size (the full data set would be better).
Getting back into the safer ground of “numbers is numbers”, if I count 935,760 widgets in 100 cartons (small widgets or large cartons), I wind up with an average of 9,357.6 widgets/carton.
If that’s being reported externally, it would probably be rounded to 9,358 or even 9,360.
If it’s internal QC data, I want the unrounded number.
If the next check gives 935,840 widgets in the 100 cartons, they round to the same values, but we’re now averaging 9,358.4 widgets per carton and we’re losing revenue.
The full data sets provide more information, but the summary data gives useful information at a glance.
But remember, the average is 9360 ±30. That is a range of 9330 to 9390. That gives shipment volume as 93300 to 93900. You win some and you lose some. The trick is to minimize the variation. 3000/93600 is 3.2%. Not real bad, but could be much better.
Sorry, my math was off, not enough coffee.
S/B 933,000 to 939,000. 3,000/936,000 ≈ 0.3%
You pose a difficult problem. With the numbers you have given, one could suppose an uncertainty of ±0.05. In other words the hundredths digit is where the uncertainty is.
Questions.
Are these measurements under repeatable conditions or reproducible conditions? If repeatable, an expanded standard uncertainty of the mean could be used. If reproducible, then the standard deviation would be appropriate.
Are they of the same measurand?
Is the distribution Gaussian or something else? It doesn’t look Gaussian at first glance.
Let’s assume you can add the uncertainties in quadrature by using relative values.
u(y)/1.1 = √[5(0.05/1.0)² +5(0.05/1.1)²]
u(y)/1.1 = √[0.0125 + 0.0103]
u(y)/1.1 = √0.023 = 0.15 ≈ 0.2
u(y) = 1.1 • 0.2 = 0.22 ≈ 0.2
So you have 1.1 ±0.2 => (0.9 to 1.3)
Now let’s look at a Gaussian distribution.
μ = 1.05 => 1.1
σ = 0.05 => 0.1
SDOM = 0.1 / √10 = 0.03
Expanded SDOM = 0.03 • 2 = 0.06 => 0.1 => (0.95, 1.15) at 95%
Using the NIST Uncertainty Machine, the values from Monte Carlo runs of various distributions were all around an interval of (1.0, 1.2) at 95% .
Take your pick. I like adding the relative uncertainties in quadrature because you don’t need to wonder about the distribution shape.
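For anyone who wants to reproduce the two treatments above, a rough sketch (the ±0.05 per-reading resolution uncertainty is as stated; the step-by-step rounding above makes its figures a little larger than these):

import math, statistics
data = [1.0] * 5 + [1.1] * 5
u_read = 0.05
mean = statistics.mean(data)                                # 1.05
sd = statistics.pstdev(data)                                # 0.05
rel = math.sqrt(sum((u_read / x) ** 2 for x in data))       # ~0.151, relative uncertainties in quadrature
print(round(rel * 1.1, 2))                                  # ~0.17 (rounded up to 0.2 above)
sdom = sd / math.sqrt(len(data))                            # ~0.016 (~0.03 above, after rounding sigma up to 0.1)
print(round(2 * sdom, 2))                                   # ~0.03 expanded, k = 2 (~0.06 above)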
That’s getting into the weeds.
They are measurements of the same 1.04995″ gage block stack, taken by the apprentice with a 6″ pocket rule graduated in tenths of an inch 🙂
Can’t help it. That is what the data shows.
Look at the Gaussian that assumes repeatable conditions. 0.95 is 0.05 less than 1.0 and 1.15 is 0.05 higher than 1.1. That is, the low value 1.0 – 0.05 and the high value is 1.1 + 0.05.
Kinda what one might expect with these numbers
and 1.05 is the midpoint of the range.
Increasing resolution beyond the capability of the measuring instrument is simply claiming that you have a cloudy crystal ball and within that cloudy, swirling mess you can discern whatever you want.
The concept of the GREAT UNKNOWN seems to be one that far too many scientists today, especially climate scientists don’t understand and, therefore, simply wish it away.
To paraphrase Don Rumsfeld, you simply can’t *know* what you don’t know.
The last significant figure in an answer should be of the same order of magnitude as the uncertainty. If your measurement uncertainty is +/- 0.3C then your stated value should not be given past the tenths digit.
It simply doesn’t matter what the resolution of the instrument is if the uncertainty overwhelms it.
That’s what I was taught as well.
Yes, but isn’t the resolution limit the lowest possible uncertainty?
“Yes, but isn’t the resolution limit the lowest possible uncertainty?”
No. My Fluke frequency meter has a resolution of 8 digits. When set at the 0.1hz range you can read out to four decimal places. But the measurement uncertainty of the instrument remains +/- 10hz because of the uncertainty of the time base oscillator in the meter. All the increased resolution allows is comparing two different frequencies but I have never given that any credence because I doubt the time base is that stable.
It’s the same with most instruments. Even micrometers suffer from wear so you can’t just depend on the resolution of the instrument to be the uncertainty.
Temperature measurement stations are no different. Even the aging of the paint on the station enclosure can cause calibration drift over the long term. That’s a microclimate effect and can’t be corrected by the actual sensor. Stick a new, calibrated sensor in an old enclosure and you won’t get a “perfect” reading because of the aging of the enclosure. No amount of homogenizing or infilling is going to help with this. So what does climate science do? They just pretend measurement uncertainty is zero so they can come up with milli-kelvin anomaly averages.
Not necessarily. One of the ISO documents (I don't remember which) discusses using an estimate of how many needle widths will fit between graduations on an analog dial. But, over time the spring in a meter can weaken and change what the actual reading is. Some meters have more damping than others. All kinds of different parts and pieces, each of which can contribute to uncertainty. This is the reason calibration intervals are needed, so one can not only have a correct reading but a correction table that may change over time.
Digital devices have their own quirks and problems. You just can't assume they provide correct values in the last digit. It can all depend on the resolution of the analog-to-digital conversion and how "rounding" is handled.
If you dig into isobudget.com, they have some sample uncertainty budgets. It is enlightening as to what ISO certification requires in terms of investigating and determining what uncertainty actually is.
I asked:
“Yes, but isn’t the resolution limit the lowest possible uncertainty?”
Tim replied:
“No. My Fluke frequency meter has a resolution of 8 digits. When set at the 0.1hz range you can read out to four decimal places. But the measurement uncertainty of the instrument remains +/- 10hz because of the uncertainty of the time base oscillator in the meter.”
Jim replied:
"Not necessarily. One of the ISO documents (I don't remember which) discusses using an estimate of how many needle widths will fit between graduations on an analog dial."
I’d pay to be a fly on the wall at the next family gathering 🙂
Quoting out of context can be lots of fun…
I’m going to be offline for a few days.
It’s been a pleasant discussion, chaps. Catch you next week.
Yes, it's true: this person has no idea about real metrology.
Statistics is not your friend.
Your use of "error" shows you are ignorant of metrology, or that you are a statistician who is using the Standard Error of the Mean incorrectly.
You need to download JCGM 100:2008, i.e., the GUM, that is the internationally accepted document for measurement uncertainty. You will find “true value” and “error” are deprecated, and the new paradigm is measurement uncertainty.
Single measurements of temperature NEVER have a large sample size. There is only a size of one. There is no way to analyze a single measurement with statistics for uncertainty. That means you must use a Type B uncertainty. NOAA provides this for ASOS at ±1.8°F and for CRN at ±0.3°C. Remember, each and every measurement carries this uncertainty.
Declaring a measurand of something like Tmonth_average lets you create a random sample of ~30 days of single measurements at a station. You can read NIST TN 1900 Ex. 2 to see how they recommend finding the mean and measurement uncertainty for reproducible results. You'll be surprised how large it is. And they didn't include the value for single measurements in the example.
Maybe you can tell everyone what happens to variance when you subtract two random variables to calculate an anomaly.
What is the surface area of our Globe? Only “60-90” sites are enough?
You say only 90 sites (ill sited or not) can give us the temperature of the globe!!??!!
Even if that were true, where are those 90 pristine sites?
Where were they 100, 200, or 300 years ago to tell us anything has really changed?
No matter what stations are used and no matter where they are located, current technology has a measurement uncertainty in the tenths digit at best. When combining measurements of different things the measurement uncertainty of each element ADDS to the total. That means that if you have 90 stations the total measurement uncertainty will be in at least the units digit. It then follows that it is impossible to tell differences less than the units digit. So how is that going to help determine the global temperature?
The fact is demonstrated quite clearly in the linked article. After you reach about 60-90 stations, the results begin to converge quite strongly. So in fact it is the case that we have considerably more station records than necessary to produce a robust estimate of the global mean.
That’s fine from a number standpoint. Now explain what that means from a physical measurement standpoint.
You are dealing with measurements. Do you think measurements converge to one physical value?
Why are there things like contour maps or pressure maps if the numbers converge to a single value?
It means that we have obtained a sample size large enough to adequately represent the underlying population – the central tendency of the data is being captured. There is no expectation that the individual measurements be identical.
ROTFLMAO! Right out of a high school stats course.
Not ONE mention about the uncertainty in the measurements.
Here is a question for you. Are the 30 daily temperatures at a single station a sample of something or the entire population of say Tmax.
Sample size only determines sampling uncertainty, not measurement uncertainty.
The two uncertainties ADD to determine the uncertainty of the average.
The “central tendency” only tells you something if you ASSume a Gaussian distribution.
Since the data you collect is likely to be skewed, the average, by itself, is *NOT* an adequate descriptor. It’s not even adequate by itself for a Gaussian distribution. You *also* need to know the variance, kurtosis, skewness, median, and mode of the data in order to properly analyze it.
As usual, you have embedded the assumption that everything is random and Gaussian. It’s something that the defenders of climate science just can’t seem to shake.
Sample size affects the uncertainty arising from random measurement error (i.e. a larger sample increases the precision of the estimate of the mean). It does not affect uncertainty arising from systematic error. This is something you constantly try to fight despite the fact that you agree with it, leaving all of us sane observers puzzled.
You are so far off base I don’t even know where to start.
Give us a reference that says measurement precision can be increased by taking multiple measurements.
Here is mine. Read this very carefully. It discusses resolution uncertainty in measurements. It is why neither statistical analysis nor averaging can increase the precision of measurement. It is why Significant Digit rules were created and are used in metrology.
https://www.isobudgets.com/calculate-resolution-uncertainty/#Convert-resolution-uncertainty-to-standard-uncertainty
If I have 100 integer measurements and μ is 5.639 and σ is ±3.4, I cannot divide 3.4 by √100 and then say I know the mean is 5.64. The mean is 6 ±0.3. You simply cannot increase the precision beyond the measurement resolution.
Statistics teaches you to state the mean as 5.64 ±0.34, but you can't justify that when dealing in measurements that only have a given resolution. That is very misleading to anyone who sees it.
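A sketch of that point with made-up integer readings (the exact figures won't match the example above, but the principle does):

import random, statistics
random.seed(3)
readings = [round(random.gauss(5.6, 3.4)) for _ in range(100)]   # 100 readings at a resolution of 1 unit
mean = sum(readings) / len(readings)         # the calculator happily reports this past the 1-unit resolution
sem = statistics.stdev(readings) / len(readings) ** 0.5
print(mean, round(sem, 2))
print(round(mean), "+/-", round(sem, 1))     # reported at the resolution actually measured, per the argument above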
You really need to learn about uncertainty budgets and how they work.
You need to stop speaking past people. I did not say measurement precision can be increased through multiple measurements, I said that the precision of the estimate of the mean can be improved by taking a larger sample. And this is unquestionably true.
The almighty mean!
But you just claimed you don’t need a lot of samples, which is it?
Free clue—glomming air temperature measurements together from disparate locations is NOT an exercise in statistical sampling.
He’s not going to understand what you’ve said at all!
Nope! He pushes the exact same stuff that bellman pushes.
“ I said that the precision of the estimate of the mean can be improved by taking a larger sample”
So what? All this means is that you have precisely located a value that is inaccurate! The accuracy of that calculated mean is *NOT* the sampling uncertainty (i.e. the mislabeled “standard error of the mean”). It is the propagated uncertainty of the measurement uncertainty from the data elements in the data set.
Nor is the precision of that mean determined by how many digits your calculator has. If the data elements are all given in the units digit then your mean can't be more precise than the units digit. You can't take temperatures recorded to the units digit and develop a mean with a last significant digit in the thousandths or hundredths digit. That's violating the significant digit rules of physical science. Yet it is *exactly* what climate science does!
The point is that the only way to quote temperature to a milli-kelvin is to have a measurement uncertainty of 0.0005 and to obtain that it is necessary to divide by a large sample size. However, each monthly sample consists of only about 30 days. That makes n=30 and √30=5.5.
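Spelling out that arithmetic (the ±1.8°F figure is the NOAA ASOS value quoted earlier; whether dividing by √n is even justified is the point in dispute):

import math
u_single = 1.8                                  # deg F, per single ASOS reading (from above)
n_days = 30
print(round(math.sqrt(n_days), 1))              # 5.5
print(round(u_single / math.sqrt(n_days), 2))   # ~0.33 deg F, nowhere near 0.0005, even granting the division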
You have yet to indicate how measurement uncertainty is propagated in the GAΔT calculations. You might explain how a measurement uncertainty of ±1.8°F of each single measurement gets reduced to ±0.0005°F.
How does one have more samples in a monthly average that is limited to the number of days in the current month? How does one have more samples in a baseline average of 30 years?
You don't even have a clue what the standard deviation of the sample means (the SEM) tells you. It has nothing to do with the "precision" of the mean; it defines the interval within which the mean may lie. IOW, the mean may have a value of 1.0, but the SEM can be ±0.001. That is an interval, not a definition of the precision of the mean.
Just stop, you have no clue about the word you are abusing here.
Care to post a list?
The surface area (land and oceans) of the Earth is 197 million square miles. Ninety operational stations, if all evenly distributed across that area (hah!), would therefore result in each station “monitoring” a surface area of about 2.2 million square miles. That’s equivalent to a circle of radius of about 835 miles!
Since those supposed 90 stations will necessarily not be evenly distributed geographically, some stations will likely have to cover surface areas of equivalent radius of 1700 miles or more.
Data from a single point centered within an area of 2.2 million square miles—to say nothing of an area four or more times that—will be representative of the average "climate" of that region? Who are you kidding???
Simple math shows how RIDICULOUS is the assertion that something around 90 monitoring stations is all that is needed for monitoring global climate.
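The arithmetic behind those figures (90 stations assumed to be evenly spread, which is generous):

import math
earth_area = 197e6                      # square miles, land plus ocean, as above
stations = 90
area_each = earth_area / stations
radius = math.sqrt(area_each / math.pi)
print(round(area_each / 1e6, 1), "million sq mi per station")   # ~2.2
print(round(radius), "mile equivalent radius")                  # ~835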
That, or on your planet your definition of a “robust estimate of the global mean” is different from mine.
What does the average temp between Chicago and Rio de Janeiro on August 1 of each year tell you about the global climate? Since winter and summer temps have different variances what does the average anomaly tell you?
The change in the mean anomaly through time is what tells us something about the climate. If we had just a single station in Chicago and a single station in Brazil, we would have poor global coverage and would not presume to know much about the global climate.
You have a bad case of brain anomaly, recommend recision.
Yet you have said only 60 stations is enough to give an accurate global average. That is *POOR* global coverage as well. That is not enough stations to cover all of the climate zones around the globe let alone the hardiness zones! There are about 12 climate zones and 25 hardiness zones. 60 stations simply wouldn’t give you enough *LAND* stations to adequately sample either one – and that doesn’t even count the oceans and their zonal breakdowns.
This is the same nonsense that Nick Stokes pushes, almost verbatim.
Why? What does 1000 stations give you that two cannot? If the entire globe is warming, then why can’t one or two stations track that warming?
You see, the mumbo jumbo about milli-kelvin increases is derived from dividing the standard deviation by √n. That doesn't provide better precision for the mean. It only tells you that the mean is better located. They teach that in statistics but too many don't understand it.
Tell us how many decimal points you would decide on and why you would do so if you had an SEM of 0.33333̅ and a mean of 3.918273645?
Supposedly—and I emphasize SUPPOSEDLY—a detectable trend in average global temperature will reflect if the Earth’s net energy balance (solar input minus Earth radiation output) is increasing or decreasing.
The associated problems are threefold:
1) There is no feasible way for scientists to convert measured temperatures into an accurate measure of the sensible energy contained in the complex (i.e., non-linear), interactive, somewhat-stochastic dynamic system that is Earth, from core to top-of-atmosphere, with its positive, negative and time-delayed feedbacks.
2) Even if we accurately knew the sign of Earth’s net energy balance TODAY, scientists do not know how long, if ever, such would take to place the Earth into a “hothouse” or “snowball” condition, either of which would indeed represent an existential threat to mankind.
3) Most importantly, there is absolutely nothing that humans can do to change the processes affecting Earth's climate that will naturally evolve over time, despite the hubris of some who think otherwise (well, at least not for the next 500 years or so, IMHO).
The results speak for themselves, as linked above, so this isn't a point of contention. An argument from personal incredulity is not compelling. You only need about 60-90 surface stations for a robust estimate. That does not mean that more stations aren't better, especially for regional estimates.
Show the math for propagating the measurement uncertainty from the initial single measurements to the final answer.
The area of the North America continent is about 9.54 million square miles . . . so your assertion is equivalent to saying that only four temperature monitoring stations are necessary to “robustly” represent the average temperature/climate over this land area, from +7° to +85° N latitude, from sea-level to 20,300 ft elevation, from coastline to heavy forest to mountain ranges to desert to Greenland, with its 1.4 mile average depth “permanent” ice sheet.
ROTFL.
Fraud, pure and simple, and you propagate the fraud.
You/they don’t know the magnitude of these “adjustments”, might as well be goat entrails.
Yet they are calculated from local temperatures!
How do you justify using the term “data” when you know it is not based on local temperatures data?
This is the reason those of us who are trained in physical science treat data as sacrosanct. You've heard of chain of custody for evidence? Well, chain of custody for measurements and their analysis is just as important to science. Without it, all you can tell folks is "trust me, it's correct". That's not science, it is pseudoscience.
They are based on local temperature data. Surface temperature indexes are compiled by interpolating point-level measurements into continuous fields. To do that you have to ensure that the point-level measurements are not reflecting local changes not common to the temperature field you are sampling.
The “chain of custody” is easy to follow, so I’m not sure why that’s an issue for you. The methodology is well documented, transparent, and repeatable. The results have been independently verified by multiple groups. The raw data is readily available. It’s not clear what more you’re looking for.
Bullshit. You are a liar as well as a fraud.
Once the trends have been modified they are no longer based on the temperature measurements regardless of what you believe. In order to obtain those new trends, the measurements would require changing.
Yes, probity and provenance are what determine the integrity and reliability of evidence.
Both are woefully inadequate with the temperature records.
“They are data representing change to the climatology.”
Complete BS.
They no longer represent anything but a wild guess.
These air temperature trendology gyrations do not and cannot represent “the climate”.
And these fools will never acknowledge the truth, it would be the end of their cherished hockey sticks.
The “adjustments applied” are just informed guesses. With most of their staff imbibing the human-caused-climate-change swill, I wonder which way they bias their informed guesses?
They might as well use tea leaves to “adjust the data” (which is no longer data after these fools get done with it).
“WMO guidance does, in fact, not preclude use of Class 5 temperature sites – the WMO classification simply informs the data user of the geographical scale of a site’s representativity of the surrounding environment – the smaller the siting class, the higher the representativeness of the measurement for a wide area.”
In other words Class 4 and 5 sites are not useful for either homogenization or infilling to other areas. Does the Met exclude these stations from any such protocols? I doubt it.
He doesn’t understand the difference between 1% uncertainty and 5% uncertainty, they are all just numbers to be ignored.
all uncertainty is random, Gaussian, and cancels.
Use of the anomaly makes it a non-issue, homogenization of station records makes it an even smaller issue still.
Anomalies are not temperatures.
They are a ΔT based on a per-station baseline. The comparison of anomalies, for example in a trend, should be done using a common baseline for the globe. Basically, climate science should decide upon a global temperature that is best for the earth. No more just assuming ANY warming is bad.
How do you homogenize temperatures in San Diego with temps in Ramona, CA?
How do you homogenize temps in Hays, KS with those in Topeka, KS?
See Menne et al 2009 for details on how the algorithm functions.
Menne et al or Menne and Williams?
M&W seems to describe the algorithm, then Menne et al applied it.
BTW, the M&W layout is appalling and the results aren’t exactly stellar.
“are class 3 or 4 for temperature as a result but continue to produce valid high-quality data. “
Met office wishful thinking.
+/-a few degrees is NOT high quality data, and certainly should not be used for assessing change over time.
The question he will never answer, because he can’t.
UHI should be included, but definitely not as it is done.
A sensor in the middle of a 100 sq. mi. area that is reasonably even in terms of terrain and fauna and no human structures can be representative of the whole area. At a minimum tests should be performed to ensure the relative uniformity of the temperature in that area – multiple site tests.
A sensor in the middle of a 100 block city, likewise can be determined to be reasonably representative of the whole city with similar testing.
With both of those (and all other areas) calibrated, then the temperature records can be WEIGHTED averaged. Obviously the 100 sq. mi. would count more than 100 city blocks. In a large city a block is “typically” about 40 sq. mi.
The global average temperature does no weighting and is meaningless. Averaging the NH average with the SH average does not take into account the number of available sensors or the geography. It is a boogieman intended to frighten people into giving up their liberties.
You’ve basically described exactly the process by which the global average temperature index is determined. Stations are homogenized via pairwise comparison of difference series between each station and its neighbors, ensuring that the trends at any given station are representative of all the “space between” the stations, then stations records are combined into area weighted, gridded averages to avoid oversampling regions with high station density. The global average is then computed as the mean of these gridded averages.
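For what it's worth, here is a toy sketch of just the area-weighting step described above (invented station list, crude 10-degree latitude bands; it is not the Met Office's or NOAA's actual code and does none of the pairwise homogenisation):

import math
from collections import defaultdict
stations = [(51.5, 0.8), (52.0, 0.7), (40.7, 0.5), (-33.9, 0.3), (-1.3, 0.4)]   # (latitude, anomaly), invented
bands = defaultdict(list)
for lat, anom in stations:                      # grid into 10-degree latitude bands to avoid oversampling
    bands[int(lat // 10) * 10].append(anom)
num = den = 0.0
for band, anoms in bands.items():               # weight each band mean by cos(latitude), i.e. by its area
    w = math.cos(math.radians(band + 5))
    num += w * sum(anoms) / len(anoms)
    den += w
print(round(num / den, 2))                      # area-weighted mean anomaly of this toy network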
And exactly how is “representative of all the “space between” tested to insure this is true?
What you are basically doing when you homogenize is saying that the station data is not fit for purpose so the data must be changed. In physical science endeavors other than climate science, data that is not fit for purpose is rejected, and not modified to something you think is correct!
These algorithms have been extensively tested on real-world and artificial datasets and have been proven to detect and remove inhomogeneities, see Menne et al 2009.
Geospatial analysis is common in many fields of science. The data are not assumed to be incorrect, they are assumed to contain non-climatic signals, which should be removed as part of the analysis of climate trends.
It is funny that you did not argue the reason for making data changes.
Attempting to change, adjust, or create data, regardless of what you call it, is implicitly admitting that the data is not fit for purpose. So somehow playing with the data is legitimate.
Geospatial analysis in other fields of science basically deals with extensive measurements that can be interpolated and averaged. In most cases it deals with measurements that are spatially consistent over time. Temperature is more like trying to geospatially homogenize wave action in the ocean.
To deal with temperature on a geospatial basis you must include things other than just temperature. Does your homogenization algorithms include things like a Brightness Index or satellite photos whereby one can judge land use changes due to population growth? If they don’t, then you are not doing science, you are playing with inexact numbers!
Weather data is very commonly used in geospatial analysis. In this case we are simply moving from point-level measurements to estimates of an anomaly field. And if you want the anomaly field to represent change in the climatology over time, you need to extract the climate signal from the individual records. There is nothing remotely controversial about doing this. Nothing is being altered in the original archived records, they are just being transformed during the analysis.
No, but you could certainly attempt to devise such an algorithm, publish your results and methods. However, existing homogenization algorithms have, as noted above, been shown to be extremely effective at removing inhomogeneities from the station network.
"No, but you could certainly attempt to devise such an algorithm." What's the point? The so-called "data" would still be junk. A measurement in weather can only tell you what's going on at the place where you took it, over time. It might or might not tell you what's happening as little as ten miles away.
The things being combined to represent large regions are the anomalies – or how much the temperature differs from what is typical at that location during the common reference period. This value very much can be expected to correlate over large distances, and in fact this has been demonstrated empirically (Hansen and Lebedeff, 1987).
Homogenization makes the record more representative of the local climate by removing non-climatic inhomogeneities from the record. Taking the anomaly of the homogenized record then allows the record to represent change in the regional climate over time.
What you trendology ruler monkeys can’t understand is that these fraudulent Fake Data procedures increase uncertainty, they cannot decrease it!
“Homogenization makes the record more representative of the local climate by removing non-climatic inhomogeneities from the record.”
If you “believe” that, you are even more of an idiot than I thought you were.
It actually introduces non-climate tainting into data that might have actually been reasonable.
“The things being combined to represent large regions are the anomalies – or how much the temperature differs from what is typical at that location during the common reference period. This value very much can be expected to correlate over large distances, and in fact this has been demonstrated empirically (Hansen and Lebedeff, 1987).”
More junk climate science. The variance of absolute temps also determines the variance of any anomalies calculated from them. The variance of temps in river valleys is *very* different from the variance of temps taken atop hills or mountains. Therefore the correlation of those temps and anomalies is very dependent on terrain and geography. For instance, the daily diurnal range of temps is vastly different in San Diego (on the Pacific coast) from the diurnal range of temps in Ramona, CA, just 30 miles away. Or Topeka, KS, smack dab in the bottom of the Kansas River valley, versus Hays, KS, 400 miles away in the Central High Plains. And both are different from Wichita, KS, on the north end of the Southern Plains.
The anomalies at these locations may have the same slope over time but their values will be quite different. Meaning the “global average anomaly” is useless for actually determining how much “climate change” is actually happening.
It's why I've long advocated for just assigning a sign to locations. A plus if it is going up, a minus if going down, and a 0 if stagnant. Add 'em up and you can say "we've got more pluses than minuses, but we don't actually know how much change is happening globally".
If the slopes are the same then the expressed rate of change is the same, so they are extremely useful for determining how much change is occurring.
“The anomalies at these locations may have the same slope over time but their values will be quite different.”
I was imprecise. The SIGN of the slopes may be the same. But as I said: “their values will be quite different.”
But you have yet to present any evidence of which of those inhomogeneities are natural and which are errors that need correcting. You are simply saying the computer smooths as it was programmed. All you are saying is "trust me, it is correct".
Inhomogeneities are not errors, they are just non-climatic elements of the station record that reflect site-specific changes (station moves, instrument changes, urban buildup) rather than broader climatic characteristics.
The computer does exactly as it was programmed to do, which is to identify and adjust breakpoints via pairwise comparison of difference series. That this quite effectively removes inhomogeneities from the network has, again, been empirically demonstrated in Menne et al, 2009. Claiming that this has not been demonstrated is to reveal a great ignorance about the topic under discussion, and I recommend you familiarize yourself with the primary literature before continuing.
Homogenisation routines are so haphazard that they can produce a different temperature fabrication for an individual site basically every time they are run, as has been shown.
They can produce changes that have absolutely no real justification
They are scientifically a complete FARCE. !!
You’ll be good enough to provide a citation.
You are using the argumentative fallacy of Equivocation. You are trying to substitute homogeneity with measurement uncertainty. They are *not* the same thing and one doesn’t define the other.
Anomalies inherit the variances of the absolute temps. Yet climate science does no weighting when combining temps/anomalies that have different variances, e.g. northern hemisphere vs southern hemisphere.
The algorithm is useless from a physical science viewpoint.
A temperature measurement only tells you what is happening at that place and time, not what is happening a quarter mile away, let alone between widely dispersed measuring points. I have personally felt temperature swings of over 40°F in a little over a hundred-mile drive, in a relatively flat piece of country that was all mixed prairie/pasture and farm land; the funny thing is the front moved very little all day and then dissipated overnight. I am talking 104°F as opposed to 68°F (I did not have AC in my vehicle at the time, though I had the heater on at first). Homogenization is a sad joke used by fools to make up fake data instead of trying to produce a network of temperature sensors of fine enough resolution that it might tell you something about what is going on after about a hundred years of such measurements. Even then the errors will be well over plus or minus one degree for each measurement.
No. That is not what I said.
Perhaps instead of a terse reply you can elaborate and explain why the above approach is inadequate in your view.
A misplaced response made below.
“I got the record highs and lows for my city, real time, back in 2007. (From NOAA)
I got them again in 2012.
About 10% had been “adjusted”.
A record high for a given day of the year that stood in 2007 had been “broken” by a lower temperature from a year earlier than 2007. Same for record lows.
(If I could figure out how to display a table as the old “Pre” formatting allowed, I’d put up a summary of the comparisons.)
Even Nick Stokes admitted the values had been changed. (With the caveat he didn’t know the source. I neglected (honestly) to mention that the values were from NOAA.)”
So “homogenizing” justifies changing measured values?
How does the “homogenizer” know the difference between a measured value and a “homogenized” value the next time it is run?
PS The 2012 records I mentioned? I got them twice, In April and then in July. Even in that short span the values were changed AGAIN.
Who was the boss of NOAA back then?
I don’t know where you obtained the records you reference, but the GHCN archives all of the raw station records, without any homogenization.
NOAA via the National Weather Service.
Cold temps have higher variance than warm temps. When cold temps and warm temps are jammed together into a global average, they should be weighted to account for the difference in variance. This variance carries over from the absolute temps to the anomalies. Yet climate science just jams northern hemisphere and southern hemisphere temps/anomalies together with no weighting (a minimal example of such weighting is sketched below).
The entire edifice of climate science and CAGW is a statistical mess.
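For what it’s worth, the weighting being asked for here is just an inverse-variance weighted mean. A minimal sketch, with made-up anomaly values and variances (the hemispheric numbers are illustrative only, not measured figures):

```python
# Inverse-variance weighting: series with larger variance get less weight.
# The anomaly values and variances below are invented for illustration.

def inverse_variance_mean(values, variances):
    """Weighted mean with weights 1/variance; also returns the
    variance of the weighted mean (1 / sum of the weights)."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return mean, 1.0 / sum(weights)

nh_anomaly, nh_var = 0.9, 0.8   # hypothetical northern-hemisphere figures
sh_anomaly, sh_var = 0.5, 0.3   # hypothetical southern-hemisphere figures

simple_mean = (nh_anomaly + sh_anomaly) / 2
weighted_mean, _ = inverse_variance_mean([nh_anomaly, sh_anomaly],
                                         [nh_var, sh_var])
print(f"simple mean: {simple_mean:.2f}  inverse-variance mean: {weighted_mean:.2f}")
```

The noisier series is down-weighted, so the two averages differ; whether the published global products should do this is the commenter’s complaint, not something the sketch settles.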
Has any city ever had a thermal profile?
One might think that should be done to characterize each city’s UHI effect.
Or would that require modelers getting away from their computers and doing real science?
Since you asked, here’s just one website discussion on this topic:
“At their final review held at ESA’s ESRIN site Frascati, Italy, the Urban Heat Islands and Urban Thermography team presented its findings on how remote sensing allows the continuous monitoring of thermal radiation emitted by urban surfaces.
“The project analysed trends in heat distribution over 10 European cities – Athens, Bari, Brussels, Budapest, Lisbon, London, Madrid, Paris, Seville and Thessaloniki – over the last 10 years, using multiple sensors.”
(https://www.esa.int/Applications/Observing_the_Earth/Satellites_predict_city_hot_spots)
I got the record highs and lows for my city, real time, back in 2007. (From NOAA)
I got them again in 2012.
About 10% had been “adjusted”.
A record high for a given day of the year that stood in 2007 had been “broken” by a lower temperature from a year earlier than 2007. Same for record lows.
(If I could figure out how to display a table as the old “Pre” formatting allowed, I’d put up a summary of the comparisons.)
Even Nick Stokes admitted the values had been changed. (With the caveat he didn’t know the source. I neglected (honestly) to mention that the values were from NOAA.)
OOPS!
This was meant as a reply to AlanJ, not Sparta Nova 4.
Perhaps the MODs can “adjust it”? 😎
Or maybe it works for both. Agree with Sparta, AlanJ, “lot of esplainin to do”!
From the above article:
“It’s almost as if the Met Office is actively seeking higher readings to feed into its constant catastrophisation of weather in the interests of Net Zero promotion.”
BINGO!
That, and—of course—to bring in payola.
Junk Temperature Measuring Network?? But the Met Office and BoM are like my tyre-lever jockeys, who have all swapped analogue for the precision of digital and decimal places rather than guesswork between graduations. What more could you ask?
Devil’s advocate here. Those “junk” stations are still necessary for local measurements and forecasting but they should never be included in a long term dataset purporting to measure “climate change.” Is there a dataset that excludes them? Can we get access to the raw data and remove those stations? I’d be curious to see the result.
Remember, temperature is the poorest possible proxy that could be used for describing climate. First, climate science doesn’t even study Tmax separately from Tmin; it just takes the mid-point between the two (which is *NOT* an average), thus losing all the information necessary to actually understand what is going on with temperatures. Second, the mid-point temp cannot distinguish climate: Las Vegas and Miami can have exactly the same mid-point temps while having vastly different climates. If your proxy can’t distinguish between climates then how can it determine “climate change”?
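A small illustration of the mid-point issue, using invented hourly temperatures: (Tmax + Tmin)/2 is not the same as the mean of the hourly readings, and two very different daily profiles can share the same mid-point. The two profiles below are hypothetical, not data for any real city:

```python
# (Tmax + Tmin) / 2 versus the mean of hourly readings.
# Both 24-hour profiles are invented purely for illustration.

desert_like  = [18]*6 + [25]*6 + [38]*8 + [24]*4   # big diurnal swing
coastal_like = [24]*6 + [27]*6 + [32]*8 + [26]*4   # small diurnal swing

for name, hours in [("desert-like", desert_like), ("coastal-like", coastal_like)]:
    midpoint  = (max(hours) + min(hours)) / 2      # the Tmax/Tmin mid-point
    true_mean = sum(hours) / len(hours)            # mean of all hourly values
    print(f"{name}: midpoint {midpoint:.1f} C, hourly mean {true_mean:.1f} C")

# Both profiles give a 28.0 C midpoint, yet their hourly means differ,
# and the shapes of the two days are clearly not the same climate.
```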
Chris,
I’m not sure what you mean by “Only classes 1 and 2 have no uncertainties attached”.
Surely every measurement has uncertainty. It is a problem that the concept of uncertainty is poorly understood, even by people who should be competently educating others.
WUWT has carried many articles about uncertainty, but some readers seem to be failing to learn. Pat Frank has written relevant educational material.
Geoff S
Classes 1 and 2 have no additional uncertainty shown. NOAA shows ±1.0°C for ASOS stations. Any class-related uncertainty would be added on top of that, probably using root-sum-square (RSS).
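As a concrete sketch of that RSS combination, pairing the ±1.0°C instrument figure quoted above with the WMO Class 4 and Class 5 siting uncertainties discussed in the article (the pairing is illustrative, not an official NOAA calculation):

```python
# Root-sum-square (RSS) combination of independent uncertainty components.
# 1.0 C is the instrument figure quoted in the comment above; 2.0 C and
# 5.0 C are the WMO Class 4 / Class 5 siting values from the article.

def rss(*components):
    """Combine independent uncertainty components in quadrature."""
    return sum(c**2 for c in components) ** 0.5

print(f"Class 4 site: +/- {rss(1.0, 2.0):.2f} C")   # about 2.24 C
print(f"Class 5 site: +/- {rss(1.0, 5.0):.2f} C")   # about 5.10 C
```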
Uncertainty is well beyond the ken of the trendology ruler monkeys, this has been demonstrated again and again. They refuse to even consider the basic concepts.
Chris,
We sceptics are not doing a good job with airport heat.
I do not know if the quantity of fuel burned at a typical airport is adequate to change the temperature in the way you claim.
I have never seen a sensitivity study that uses a range of values to model (yes, models are needed) an airport area: slices of air above that area, fuel weight burned over time, heat of combustion, and various assumptions for the rate of loss of heated air from the modelled slices over time. (A rough back-of-the-envelope version of such a calculation is sketched after this comment.)
It is possible that this modelling has been done but I have missed it. Links appreciated.
I can imagine an embarrassing scenario in which your Met Office knows that the burned fuel is too small to matter and has a quiet chuckle each time we yell at them for being sloppy. We have to prove such things, not guess at them.
Similarly, we sceptics need measured data on the air temperature at various distances from the exhausts of tethered aircraft. Do you have any data showing that temperature changes become too small to matter at 5, 10, 20, 50, 100 or whatever metres from a Jumbo jet in taxi mode?
Geoff S
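To make Geoff’s suggestion concrete, here is a crude single-slab version of the sensitivity calculation he describes. Every input (fuel burn rate, slab area and depth, residence time) is an assumption chosen only to show the arithmetic, not a claim about any real airport; the point is that each of these numbers can be varied over a plausible range, which is exactly the sensitivity study being asked for:

```python
# Crude single-slab heat balance for an airport: how much could burned fuel
# warm the air above the site before the wind flushes that air away?
# All input values below are assumptions for illustration only.

FUEL_BURNED_KG_PER_HOUR    = 50_000     # assumed on-site taxi/takeoff fuel burn
HEAT_OF_COMBUSTION_J_PER_KG = 43e6      # approximate value for jet fuel
AIR_DENSITY_KG_PER_M3       = 1.2
CP_AIR_J_PER_KG_K           = 1005

SLAB_AREA_M2         = 10e6             # assumed ~10 km^2 airport footprint
SLAB_DEPTH_M         = 50               # assumed mixing depth above the surface
RESIDENCE_TIME_HOURS = 0.25             # assumed time before wind replaces the slab

heat_in  = FUEL_BURNED_KG_PER_HOUR * RESIDENCE_TIME_HOURS * HEAT_OF_COMBUSTION_J_PER_KG
air_mass = SLAB_AREA_M2 * SLAB_DEPTH_M * AIR_DENSITY_KG_PER_M3
delta_t  = heat_in / (air_mass * CP_AIR_J_PER_KG_K)

print(f"Temperature rise of the slab: {delta_t:.2f} C")
```

Swapping in different fuel burns, mixing depths and residence times (and, separately, measuring temperatures at set distances from a tethered aircraft, as Geoff suggests) is what would turn this from arithmetic into evidence either way.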
I remember a post here from several years ago where a record high was set in (I think), Boston. Maybe NYC?
Someone took a very close look. It turns out the “record high” recorded at the airport perfectly coincided with a temp spike and a brief wind shift that blew the air from an idling jet on the runway to the sensor site.
No surrounding stations showed such a spike.
Same here in Brisbane, some time ago now, but I’d just bet nothing has changed. Sitting in a plane in a queue, the plane stopped, other traffic ahead; I looked out the window and there was a big sign on a fence telling people to stay out. What was behind the fence? The Bureau of Meteorology weather station. We then moved forward and turned to the right, leaving the weather station right in the plane’s engine blast, nice hot jet engines. We must have sat there for five minutes or more; I wonder if the other planes lined up to take off did the same. Must have made for some interesting readings from those monitors.
Gavin would like you all to know he hasn’t got a clue what’s going on, but if you could all run about in circles panicking and send more grants, it would be most appreciated:
‘We should have better answers by now’: climate scientists baffled by unexpected pace of heating | Climate crisis | The Guardian
The people at the Met Office are hired and paid to do honest work and reporting. The verdict is in: they aren’t doing either. All top managers should be fired and blackballed from ever holding a government job or a job that works with the government. All the money saved by firing these clowns should be spent removing all stations with a class four or five rating. Those in line to replace those fired should be the ones to remove the bad stations. Remind them while they are removing the stations that this is what their replacements will be doing if they don’t perform their work correctly and honestly.