Surface Air Temperature Trends, Climate Models vs Observations, 1979-2025

From Dr. Roy Spencer’s Global Warming Blog

Roy W. Spencer, Ph.D.

This is just a short update on how observed global surface air temperature (Tsfc) trends compare with those of 34 CMIP6 climate models through 2025. The following plot shows the Tsfc trends, 1979-2025, ranked from the warmest to the coolest.

“Observations” is an average of 4 datasets: HadCRUT5, NOAAGlobalTemp Version 6 (now featuring AI, of course), ERA5 (a reanalysis dataset), and the Berkeley 1×1 deg. dataset, which produces a trend identical to HadCRUT5 (+0.205 C/decade).
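For readers who want to reproduce this kind of number, a decadal trend is conventionally the ordinary least-squares slope of the anomaly series, scaled to C/decade. A minimal sketch with synthetic anomalies (NOT values from any of the four datasets):

```python
# Sketch: a decadal surface-temperature trend as the ordinary least-squares
# slope of an annual anomaly series, scaled to C/decade.  The anomalies
# below are synthetic placeholders, NOT HadCRUT5/NOAA/ERA5/Berkeley data.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1979, 2026, dtype=float)   # 1979-2025 inclusive
true_slope = 0.02                            # C per year, illustrative only
anomalies = true_slope * (years - years[0]) + rng.normal(0.0, 0.1, years.size)

slope_per_year = np.polyfit(years, anomalies, 1)[0]  # fit anomaly = a + b*year
trend_per_decade = 10.0 * slope_per_year             # convert C/yr -> C/decade
print(f"trend: {trend_per_decade:+.3f} C/decade")
```

The datasets differ mainly in station coverage and infilling, not in this trend-fitting step.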

I consider reanalyses to be in the class of “observations” since they are constrained to match, in some average sense, the measurements made from the surface, weather balloons, global commercial aircraft, satellites, and the kitchen sink.

The observations moved up one place in the rankings since the last time I made one of these plots, mainly due to an anomalously warm 2024.

OldRetiredGuy
January 9, 2026 2:05 pm

Does this “Observation” number include all the defective sites?

Laws of Nature
Reply to  OldRetiredGuy
January 9, 2026 3:11 pm

Do the models include the Tonga eruption?

Reply to  Laws of Nature
January 9, 2026 5:47 pm

Models don’t include El Niño events…

… yet UAH data shows they are the source of all warming in the last 46 years.

Reply to  OldRetiredGuy
January 9, 2026 5:46 pm

“Observations” are based, to a very large percentage, on sites that are totally unfit for the purpose of measuring changes in temperature over time.

At least 3 surveys have been carried out that prove this…

Our host’s “Surface Station” survey

Ken’s Kingdom in Australia

And Ray Sanders’ work on the woeful Met Office sites, that can be found at Tallboys blog.

There may be other surveys elsewhere… but I doubt the outcome will be less awful.

As well as that, there is a large amount of fake data, and deliberate “adjustments” that increase the warming trend.

sherro01
Reply to  bnice2000
January 10, 2026 7:47 pm

bnice,
Here is an analysis of Australian data showing how wrong the official warming figure is.
It is wrong because they reject original, raw data and use adjusted data, adjusted by subjective methods that have no place in this type of prediction.
Next thing, they will be claiming that it is OK to use GUM style uncertainty estimates on adjusted data. Sorry guys, proper science does not allow that for guesses.
Geoff S
https://www.geoffstuff.com/halfwarm.docx

roywspencer
Reply to  OldRetiredGuy
January 11, 2026 12:51 pm

Yes.

gyan1
January 9, 2026 2:48 pm

Surprised there are 8 models projecting less warming than observations. I’ve never heard of those before. Does the IPCC even acknowledge them?

Nick Stokes
January 9, 2026 2:51 pm

In rank terms, “observations” are toward the lower end. But the mid-range values are less than 10% higher than the blue. That seems pretty good to me.

Reply to  Nick Stokes
January 9, 2026 3:27 pm

Question. Are the ‘79 – ‘25 trends from the models based on their pre-‘79 predictions or do they incorporate known data from that interval? I suspect the latter, in which case I wouldn’t be too impressed.

Reply to  Frank from NoVA
January 9, 2026 4:25 pm

Here it is stated that the CMIP6 modeling cycle used 1850 to 2014 as the historical period of observations.
https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019MS002034

Reply to  David Dibbell
January 10, 2026 4:05 am

To clarify, I note that the historical observations are not direct inputs to the modeled evolution of climate states in the iteration. Nick Stokes makes this point below. I also note that tuning strategy is a deep subject, as explained here. Search within this document on the word “tuning” and you’ll see what I mean.

https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2017MS001209

Here is an important point in that document:
“There is an interesting difference between groups that tune energy balance for PD (present-day) conditions and those that tune for PI (preindustrial) conditions. Our tuning has been entirely based on the standard AMIP period (1980–2014), tuning to estimates of PD energy balance using observed SSTs as boundary conditions.”

So the recent historical period including satellite observation is used in this case for tuning the model.

“AMIP” refers to the evaluation of the Atmosphere component of the models as opposed to CMIP, which evaluates the Coupled models.

Nick Stokes
Reply to  Frank from NoVA
January 9, 2026 4:30 pm

“do they incorporate known data from that interval?”

No, they don’t, else you wouldn’t get the very warm ones. That is not how GCMs work. They are initialised from data, usually pre-1900, and left to run. They can’t do anything else – there is no provision for including observations along the way (that would be reanalysis).

Laws of Nature
Reply to  Nick Stokes
January 9, 2026 5:42 pm

Sometimes model tuning is described as an art, indicating that the tuner has an influence on the results of the model. If that is the case, not only the current knowledge but also the beliefs of the tuner are part of the model. IMHO all GCMs mentioned in the IPCC reports are on the alarming side, and it is certainly the case that current knowledge of how the world is warming is used in the tuning process.

Nick Stokes
Reply to  Laws of Nature
January 9, 2026 7:06 pm

“it is certainly the case that current knowledge of how the world is warming is used in the tuning process”

It is certainly not. You have no idea how tuning works.

Reply to  Nick Stokes
January 9, 2026 10:51 pm

Tuning is a suspect method which you are still in thrall to.

Reply to  Nick Stokes
January 9, 2026 11:16 pm

You CANNOT tune to past data when that data is totally BOGUS, and has a heavily manipulated warming trend built into it by the alarmist agenda…

…and expect anything even remotely meaningful.

That is true ANTI-SCIENCE.

Reply to  Nick Stokes
January 10, 2026 12:57 am

Funny how model tuning is only ever practiced in Climate “science”.

Reply to  Nick Stokes
January 9, 2026 5:54 pm

The data they initialise to is mostly “totally made up” and does not resemble actual temperature measurements anywhere.

There is basically no “global data” pre-1900, and no usable surface data for most of the 70% of the planet that is ocean before about 2005 and the implementation of ARGO.

Even now, there are huge areas of the surface which have no measurements and where measurements that do exist could only be counted as “junk data” (eg UK Met)

Just one of the many reasons that the models have no more relevance to reality than a low-level computer game.

Michael Flynn
Reply to  Nick Stokes
January 9, 2026 7:09 pm

They are initialised from data, usually pre 1900, and left to run.

Go on, tell us that any “ridiculous” results are not discarded. A pack of ignorant and gullible fools, pretending that their sloppy amateur attempts at computer programming (I’m referring to Gavin Schmidt here) represent anything except wishful thinking and fantasy.

You might believe that the outcomes of a deterministic chaotic system can be predicted any more skilfully than a smart 12 year old can do, or even that adding CO2 to air makes thermometers hotter!

Why would anybody value the opinion of a person who is obviously ignorant and gullible, as well as being divorced from reality? You don’t have to answer.

leefor
Reply to  Nick Stokes
January 9, 2026 7:34 pm

Pre-1900? When there really wasn’t global coverage? lol.

Nick Stokes
Reply to  leefor
January 9, 2026 8:44 pm

Yes, exactly. You too have no idea how tuning works. It adjusts parameters that are poorly known so that some particular past value is matched – usually something like SST somewhere. It isn’t trying to align some variable’s history.

leefor
Reply to  Nick Stokes
January 9, 2026 9:31 pm

How can you match an unknown parameter? 😉

Reply to  leefor
January 9, 2026 11:11 pm

They just MAKE IT UP until they get an answer they like.

We all know the “past values” are also meaningless, and based on basically no real data.

” Usually something like SST somewhere.”

You mean another number pulled out of …… somewhere ??

How is that even remotely appropriate for “global” models ?

Reply to  Nick Stokes
January 10, 2026 5:59 am

Sounds much like how we adjust “poorly known” (many of them, more than with AGW models) cell parameters, to better simulate petroleum reservoir and production performance. In spite of this faux observation.

https://wattsupwiththat.com/2026/01/09/surface-air-temperature-trends-climate-models-vs-observations-1979-2025/#comment-4152125

Trillions in $ value addition, spread around, if not well enough…

Reply to  Nick Stokes
January 10, 2026 7:08 am

“You too have no idea how tuning works.”

As you say, “parameters that are poorly known”. So what tuning does is like pushing a round cylinder into a square hole because it is small enough to fit.

Most people would call that guessing because you don’t know exactly where it should go.

Somehow I am reminded of Dr. Pat Frank’s analysis and conclusion about GCMs having a ±15 C uncertainty interval. Nothing you have said here gives me doubts about that conclusion.

gyan1
Reply to  Jim Gorman
January 10, 2026 10:06 am

A key point! Many papers have shown that the uncertainties in model output are 10-100x the tiny forcing they are trying to isolate. It’s preposterous pseudoscience to make any conclusions when the signal is deeply buried in the noise. “We don’t know” is the honest answer.

Reply to  gyan1
January 10, 2026 1:52 pm

“We don’t know” is the honest answer.

Hallelujah! Someone that gets it! If the differences you are attempting to discern are less than the uncertainty then that difference will forever remain in the dimension named “THE GREAT UNKNOWN”. No amount of averaging will help penetrate THE GREAT UNKNOWN.

The Nordic rune set has a blank rune – usually interpreted as the Great Unknown, and sometimes like “Que sera, sera; whatever will be, will be; it’s not ours to see; que sera, sera”.

The climate models and modelers along with climate science think they can penetrate the meaning behind that blank rune. They can’t.

Reply to  Nick Stokes
January 9, 2026 10:57 pm

They are initialised from data, usually pre 1900, and left to run.

How is this possible, Nick? There is no global, reliable, measured temperature record before 1900.

Pre-1920 global measurements.

Coverage gaps:

  • Very sparse data from the Southern Hemisphere, oceans, Africa, South America, and polar regions
  • Measurements concentrated in Europe and parts of North America
  • Ocean temperature data extremely limited before systematic ship measurements

Instrument problems:

  • Non-standardized thermometers with varying accuracy
  • Inconsistent measurement methods (time of day, shelter types, locations)
  • Urban heat island effects not well understood or corrected
  • Changes in measurement techniques over time

Data quality:

  • Many stations have incomplete records
  • Observer errors and inconsistent protocols
  • Poor documentation of station moves or instrument changes

If you cite the reliable USA record, then surely you need to explain the high temperatures of the 1930s, which defeat the “hotter than evah!” narrative.

Nick Stokes
Reply to  Redge
January 10, 2026 12:39 am

The point of winding back is that the present-day result doesn’t depend much on the initial state, just as we know the real climate in 2025 doesn’t depend on what happened in 1880. You need an initial state which is physically consistent, so it won’t blow up.

Reply to  Nick Stokes
January 10, 2026 12:47 am

Without empirical data from across the globe, the initial state is guesswork; therefore any results from the models are also guesswork.

Reply to  Nick Stokes
January 10, 2026 5:55 am

In other words the modelers have to GUESS at actual history in order to have initial conditions that don’t cause the model to blow up!

Exactly what range of initial conditions causes the models to blow up? Inquiring minds want to know.

Reply to  Nick Stokes
January 10, 2026 7:23 am

You need an initial state which is physically consistent, so it won’t blow up.

Circular logic. How do you know that the initial state represents anything physical if you don’t have physical measurements to go by?

What you are doing is “creating” what you think is physically consistent. More guessing of parameter values. Is it any wonder that people criticize the uncertainty of the outputs? At the rate the models are progressing toward accurate predictions it will take 500 years, and that is only guessing, it could be longer.

Sparta Nova 4
Reply to  Nick Stokes
January 12, 2026 10:37 am

Just as we knowreal climate in 2025 doesn’t depend on what happened in 1880.

My magic decoder ring just blew up.

sherro01
Reply to  Nick Stokes
January 10, 2026 7:53 pm

Nick,
Here is some neat imaging of math that has a quite informative section on the importance of initial conditions in a somewhat “pure” setting.
BTW, also born 1941. June 16 at Oakey, Qld.
How are you going with getting older graciously?
Geoff S

https://youtu.be/8jVogdTJESw

Nick Stokes
Reply to  sherro01
January 11, 2026 9:03 pm

Hi Geoff,
Interesting time to ask. Reply has been delayed by my 80th birthday to-do.

Reply to  Nick Stokes
January 9, 2026 4:02 pm

Except the IPCC averages the ludicrously high ones with all the others to arrive at an inflated number.

Seems pretty bad to me, averaging models that are 2X observations with ones that are 10% higher. In fact it’s not bad, it’s malfeasance.

Nick Stokes
Reply to  davidmhoffer
January 9, 2026 4:34 pm

“Except the IPCC averages the ludicrously high ones with all the others”
Really? Reference please. Bloggers sometimes like to do that, but the IPCC?

leefor
Reply to  Nick Stokes
January 9, 2026 7:37 pm

Actually you may be right. They just top and tail the models. Not too much, not too little. 😉

Reply to  leefor
January 9, 2026 8:11 pm

Nick is suggesting that they “choose” the models they want to include… says it all!

Choose what best matches the faked surface data.

Junk models, cherry-picked, against mal-adjusted surface data.

Reply to  bnice2000
January 9, 2026 10:52 pm

But they are all tuned which Nick likes a lot because it sounds smart!

Nick Stokes
Reply to  leefor
January 9, 2026 8:45 pm

Nobody seems interested to find out what they actually do. Just make it up.

leefor
Reply to  Nick Stokes
January 9, 2026 9:33 pm

You mean like the models?

Reply to  Nick Stokes
January 9, 2026 11:06 pm

How many “parameters” do they use, Nick… enough to wiggle the elephant’s tail?

The models are JUNK from the ground up.

Hindcast data is also JUNK.

Scientifically, the whole exercise is totally pointless.

Reply to  Nick Stokes
January 10, 2026 8:35 am

Here is what I want to know. Temperature is not a very good proxy for heat. If heat enters into the “physics” calculations of the models, it must calculate the enthalpy that goes with each temperature location. Where does that information originate or is it tuned?

We have sufficient measurements of temperature and concurrent humidity in many regions today. How do the results of the model’s enthalpy and calculations of enthalpy from actual measurements match up?

Reply to  leefor
January 10, 2026 3:25 am

They just top and tail the models.

Not sure about “just”, but the IPCC does indeed, for the AR6 WG-I report at least, “top and tail” the model ranges for several variables, especially ECS.

NB : Nick Stokes is correct as well, the IPCC does not simply “average[] the ludicrously high ones with all the others”.

This is shown most starkly in FAQ 7.3 Figure 1, on page 1025 of the AR6 WG-I assessment report :


From the preceding text on page 1023 (highlighting added by me) :

All four lines of evidence rely, to some extent, on climate models, and interpreting the evidence often benefits from model diversity and spread in modelled climate sensitivity. Furthermore, high-sensitivity models can provide important insights into futures that have a low likelihood of occurring but that could result in large impacts. But, unlike in previous assessments, climate models are not considered a line of evidence in their own right in the IPCC Sixth Assessment Report.

The ECS of the latest climate models is, on average, higher than that of the previous generation of models and also higher than this Report’s best estimate of 3.0°C. Furthermore, the ECS values in some of the new models are both above and below the 2°C to 5°C very likely range, and although such models cannot be ruled out as implausible solely based on their ECS, some simulations display climate change that is inconsistent with the observed changes when tested with ancient climates. A slight mismatch between models and this Report’s assessment is only natural because this Report’s assessment is largely based on observations and an improved understanding of the climate system.


To see why the media tends to focus on how “accurate” the climate models are in relation to (global mean / average) surface temperatures (GMST / GAST), and not so much on other variables / parameters, you just need to look at FAQ 3.3 Figure 1, on page 520 :


From the main body of that FAQ, on page 519 (with some extra “line feed” characters added by me along with the highlighting…) :

Scientists evaluate the performance of climate models by comparing historical climate model simulations to observations. This evaluation includes comparison of large-scale averages as well as more detailed regional and seasonal variations. There are two important aspects to consider:
(i) how models perform individually and
(ii) how they perform as a group.

The average of many models often compares better against observations than any individual model, since errors in representing detailed processes tend to cancel each other out in multi-model averages.


The actual contents of the IPCC WG-I, “The Scientific Basis”, reports is usually quite good, with the inclusion of many “uncertainty” discussions by the scientists who write the “main bodies” of each individual chapter.

It is unfortunate that these “uncertainties” are mostly stripped out in the “line-by-line approved by government representatives” SPMs.

Reply to  Mark BLR
January 10, 2026 6:18 am

“since errors in representing detailed processes tend to cancel each other out in multi-model averages.”

And once again, the garbage climate science meme of “uncertainty is random, Gaussian, and cancels”.

Uncertainty (even if couched in terms of “error”) ADDS, it doesn’t cancel. The standard deviation of the data (i.e. the model outputs) is the uncertainty AND error factor. Standard deviation doesn’t provide for “cancellation”.

If you look at the distribution of the data (i.e. the model outputs plus observations), it is highly skewed. That also argues against “cancellation of errors” being the case. It even makes calculation of a “standard deviation” a pretty meaningless exercise, since the average of the data and the mode of the data will not be the same and the standard deviation may very well not include 68% of the data. In fact, it probably won’t.

There are ways to estimate the uncertainty of skewed data. But climate science refuses to use them, instead depending on the meme that “all uncertainty is random, Gaussian, and cancels”.
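The skewness point above can be illustrated with any skewed distribution. Here is a minimal sketch using a lognormal sample, which stands in for “skewed data” purely for illustration; it is NOT model output or temperature data:

```python
# Illustration: for a skewed distribution the mean and median separate, and
# the mean +/- 1 standard deviation band does not bracket ~68% of the values
# the way it does for a Gaussian.  The lognormal sample is illustrative only.
import numpy as np

rng = np.random.default_rng(42)
x = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

mu, med, sd = x.mean(), np.median(x), x.std()
frac_within = np.mean((x > mu - sd) & (x < mu + sd))  # coverage of mean +/- 1 sd

print(f"mean={mu:.2f}  median={med:.2f}  sd={sd:.2f}")
print(f"coverage of mean +/- 1 sd: {frac_within:.1%}")  # not ~68%
```

For this sample the mean sits well above the median and the ±1 sd coverage lands nowhere near the Gaussian 68%, which is the commenter's point about the standard deviation losing its usual interpretation.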

leefor
Reply to  Nick Stokes
January 9, 2026 8:09 pm

“Annual global land precipitation will increase over the 21st century as GSAT increases (high confidence). The likely range of change in globally averaged annual land precipitation during 2081–2100 relative to 1995–2014 is –0.2 to +4.7% in the low-emissions scenario SSP1-1.9 and 0.9–12.9% in the high-emissions scenario SSP5-8.5, based on all available CMIP6 models. ” So using the models to average.

leefor
Reply to  leefor
January 9, 2026 8:19 pm

Or the use of ensembles.

Nick Stokes
Reply to  leefor
January 9, 2026 8:49 pm

No, using models to estimate a range.

leefor
Reply to  Nick Stokes
January 9, 2026 9:37 pm

Ah, an estimation. Such sad commentary on science. What are the error bars on these “estimations”?

Nick Stokes
Reply to  leefor
January 10, 2026 12:34 am

It’s right there in your quote.

Reply to  Nick Stokes
January 10, 2026 6:21 am

So you are saying the uncertainty can be as much as 5% to 13%?

And climate science expects us to put any kind of faith in the model outputs?

Reply to  Nick Stokes
January 9, 2026 11:08 pm

No.. they are using low-end computer games to “make-up” whatever they want.

Reply to  Nick Stokes
January 9, 2026 11:31 pm

How many spaghetti graphs have we seen with an average line drawn through them?

Answer: Lots.

Nick Stokes
Reply to  davidmhoffer
January 10, 2026 12:35 am

But by whom?

MarkW
Reply to  Nick Stokes
January 9, 2026 4:43 pm

If you throw out all the bad data, any dataset can be made to look good.

Reply to  Nick Stokes
January 9, 2026 5:49 pm

You do know those “observations” are totally bogus in the first place, don’t you !

Tom Johnson
Reply to  Nick Stokes
January 9, 2026 6:54 pm

Would you fly from New York to Australia in a “pretty good” airplane?

Reply to  Tom Johnson
January 9, 2026 10:54 pm

Sure if the Airplane is well “tuned“…….

Reply to  Nick Stokes
January 10, 2026 12:56 am

 That seems pretty good to me..

Rubbish. This whole exercise is utterly pointless given that we don’t know what is causing the observed trend, and if anyone tells you they do know, they’re delusional or lying. GCMs have been less than useless. Making an income from them is a total scam and criminal theft from the taxpayer.

Reply to  Nick Stokes
January 10, 2026 5:52 am

“In rank terms, ‘observations’ are toward the lower end. But the mid-range values are less than 10% higher than the blue. That seems pretty good to me.”

If the difference from the observational trend is considered to be an UNCERTAINTY factor, then a 10% uncertainty is HUGE!

Would you accept a 10% difference between what the electric utility charges you and what you actually used to be acceptable – especially if their bill runs 10% high?

Reply to  Nick Stokes
January 10, 2026 7:10 am

Figures lie and liars figure.

If you use the blue value of about 0.225 and a midrange value for the ones above that, you get about 0.325. The difference is 0.1, which gives about a 45% higher value rather than 10%. Why am I not surprised you would find a method of playing with numbers that gives a value that appears better?

In addition, just looking at the values it appears that they are skewed badly towards warming. Again, your attempt to minimize the differences is questionable at best.
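The percentage claim above can be checked directly. A minimal sketch using the commenter's eyeball estimates from the plot (0.225 and 0.325 C/decade are read-off values, not the underlying data):

```python
# Checking the arithmetic above with the commenter's read-off values
# (0.225 and 0.325 C/decade are eyeball estimates from the plot, not data).
blue = 0.225                       # observational trend, C/decade
warm_mid = 0.325                   # midrange of the models warmer than it
excess = (warm_mid - blue) / blue  # fractional excess relative to observations
print(f"{excess:.0%}")             # about 44%, i.e. "about 45% higher"
```

Whether 10% or ~45% is the right comparison depends on which set of models you take the midrange over, which is exactly the disagreement in this subthread.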

Nick Stokes
Reply to  Jim Gorman
January 11, 2026 9:10 pm

I’ve no idea where you get those figures. Here is the plot with the blue level in yellow and a line 10% higher.


Reply to  Nick Stokes
January 12, 2026 9:45 am

Draw your top line at 0.325. As I said, that’s a midline value for the models that are warmer than the blue one. Why do you want to eliminate them?

John Power
January 9, 2026 4:47 pm

“I consider reanalyses to be in the class of ‘observations’ since they are constrained to match, in some average sense, the measurements made from the surface, weather balloons, global commercial aircraft, satellites, and the kitchen sink.”
 
At last I understand why the IPCC claims to be reporting ‘gold standard climate science’ from around the world!
 
But I counted 26 models giving Higher trends than Observations and only 8 giving Lower ones. Shouldn’t there be 17 Higher and 17 Lower if the models were unbiased?
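The 26-vs-8 tally above amounts to an informal sign test. Here is a minimal sketch of that framing (the counts are the commenter's own; casting it as a binomial test is my addition, not anything in the post):

```python
# A simple sign test for the 26-of-34 tally: if each model were equally
# likely to trend above or below observations, how probable is a split
# at least this lopsided?
from math import comb

n, k = 34, 26
p_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2.0**n  # P(at least k high)
print(f"P(>= {k} of {n} on the high side | unbiased) = {p_tail:.4f}")
```

Under the equal-odds null the tail probability is well under 1%, which is the quantitative version of "shouldn't there be 17 and 17?". (This treats the models as independent, which they are not.)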

Reply to  John Power
January 10, 2026 6:46 am

You have just defined a highly skewed distribution. The average (mean) of a highly skewed distribution is a useless statistical descriptor. Since the mean is a base for calculating the standard deviation of a distribution, the statistical descriptor “standard deviation” is useless as well.

That also means that the “average” of the data (i.e. the model outputs) known as the “ensemble” is useless as well.

Climate science SHOULD, let me emphasize SHOULD, start using the 5-number statistical description of all of their data. That allows generating an uncertainty interval by subtracting the 1st quartile from the 3rd quartile (the interquartile range). It only encompasses 50% of the possible values instead of 68%, but it would be a far better descriptor of the uncertainty associated with the data than the standard deviation.

Shouldn’t there be 17 Higher and 17 Lower if the models were unbiased?”

It’s a climate science lie that the models are not biased. Parameterizations are, by definition, biased. They are guesses, and guesses are subjective, i.e. they are biased. If the parameterizations were based on actual data, then the actual data should be used in the models and not the parameterizations.

John Power
Reply to  Tim Gorman
January 10, 2026 6:23 pm

As you say, Tim, the distribution is highly skewed. I note that it’s skewed towards the High end of the range, too, which I think provides strong evidence that the CMIP6 models are ‘running hot’ as was the CMIP5 generation before them. However, if I remember correctly, the numbers were even more skewed in CMIP5, so perhaps we should allow the modellers a little progress in the right direction, albeit glacially slow, if we can assume that the baseline reference-standard of ‘Observations’ is stable and trustworthy (which we can’t of course).

“Climate science SHOULD, let me emphasize SHOULD, start using the 5-number statistical description of all of their data. That allows generating an uncertainty interval by subtracting the 1st quartile from the 3rd quartile (the interquartile range). It only encompasses 50% of the possible values instead of 68%, but it would be a far better descriptor of the uncertainty associated with the data than the standard deviation.”

I’m not sure I understand exactly what you’re proposing here, Tim, but it does sound potentially very useful if the amount of information that is added by doing it is greater than the amount that would be lost by its only encompassing 50% of the data. If you can expound the idea in greater detail, we might be able to assess it properly from an Information theory perspective.
 
“It’s a climate science lie that the models are not biased.”
 
If so, it is a lie that is flatly refuted by the available evidence as we have just seen.
 
“If the parameterizations were based on actual data then the actual data should be used in the models and not parameterizations.”
 
I don’t doubt that the parameterizations are indeed mere guesses as you say and I think they are used in the models precisely because they are not based on any valid data, because the modelers don’t have any and can’t get any either, because the available technology and methodology are not up to the superhuman job of producing it on a global scale at the present stage of their evolution. Consequently, the models are just castles in the air and the stuff that dreams are made on at the end of the day, it seems to me.

Reply to  John Power
January 11, 2026 8:12 am

I’m not sure I understand exactly what you’re proposing here, Tim, but it does sound potentially very useful if the amount of information that is added by doing it is greater than the amount that would be lost by its only encompassing 50% of the data. 

Maybe I can help explain. If I give you two statistical parameters – a mean of 70 and a standard deviation of 5 – what is the first thing you visualize? Most folks will see in their heads a normal curve with the peak at 70 and the one-standard-deviation range going from 65 to 75. How would anyone with just those two numbers visualize a skewed distribution? You can’t; there isn’t enough information.

A five number plot gives you 5 statistical parameters one can use to visually see how the distribution is shaped. Those parameters are:

  • Minimum – the smallest number in the data set.
  • First Quartile – the 25th percentile; the value below which 25% of the data fall.
  • Median – the 50th percentile; the middle value of the data.
  • Third Quartile – the 75th percentile; the value below which 75% of the data fall.
  • Maximum – the largest value in the dataset.

A 5-number plot is also known as a box & whisker plot. The box contains the values from the 1st quartile to the 3rd quartile, which by definition contain 50% of the data. The median position in the box indicates skewness. If the distribution is symmetric, the median will be in the middle of the box and will equal the mean. If the median is offset from the middle, that indicates skewness, which should also mean an asymmetric uncertainty interval. Either way, the box still contains 50% of the data.
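The five-number description above can be computed directly. A minimal sketch (the ten values are invented for illustration; they are not the model trends from the post):

```python
# Sketch: the five-number summary (min, Q1, median, Q3, max) and IQR with
# NumPy.  The ten sample values are made up for illustration only.
import numpy as np

data = np.array([0.13, 0.18, 0.20, 0.21, 0.22, 0.24, 0.27, 0.31, 0.38, 0.52])

q0, q1, q2, q3, q4 = np.percentile(data, [0, 25, 50, 75, 100])
iqr = q3 - q1                        # the "box": middle 50% of the data
skew_hint = (q3 - q2) - (q2 - q1)    # > 0 when the upper half is stretched out

print(f"min={q0:.2f} Q1={q1:.4f} median={q2:.2f} Q3={q3:.2f} max={q4:.2f}")
print(f"IQR={iqr:.4f}, median offset in box={skew_hint:+.4f}")
```

Here the median sits closer to Q1 than to Q3, so the box would show the median offset toward the low end, the visual skewness cue described above.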

John Power
Reply to  Jim Gorman
January 17, 2026 6:57 pm

Hi Jim. Sorry it’s taken me a week to reply – I’ve been snowed under with work suddenly and am still in the throes of catching up and clearing the backlog.
 
Thanks for explaining the ‘5-number’/‘box & whisker’ concept. I was completely unaware of it beforehand and I appreciate your taking the trouble to enlighten me about it.
 
I’m still a bit puzzled about the position of the median in the box though. Since you have defined the median as being located at the 50th percentile – (a definition with which I concur, by the way) – and the box extends from the 1st quartile (i.e. 25th percentile) to the 3rd quartile (75th percentile), surely the location of the median must be fixed permanently at the centre of the box by definition? But in your last paragraph you say:
“The median position in the box indicates skewness.”
Did you mean to say, perhaps:
“The median position in relation to the mean indicates skewness.”?
 
I’ll reply to Tim a.s.a.p.. Thanks again for this, Jim.

Reply to  John Power
January 18, 2026 5:25 pm

Did you mean to say, perhaps:

“The median position in relation to the mean indicates skewness.”?

Nope. The median can be anywhere in the “box”. The mean really doesn’t come into this analysis. It is another way to visualize the shape of the distribution.

The box is formed by the median of the bottom half of the data (the data below the overall median) and the median of the top half (the data above it).

The whiskers run from the 1st quartile down to the minimum value and from the 3rd quartile up to the maximum value.

Reply to  John Power
January 11, 2026 8:59 am

Measurement uncertainty is defined in the GUM as a standard deviation. That’s kind of a limited definition that was settled on so everyone would have a common definition.

If measurement uncertainty intervals are given as a metric for accuracy, useful for evaluating later measurements, and give a range of possible values that are considered acceptable, then a 50% interval and a 68% interval are both useful metrics. The 68% interval really only applies to a normal distribution, while the 50% interval applies to distributions in general. If you want to widen the 50% interval, use an adjustment factor similar to the “coverage factor” used in the GUM.

Mean/standard deviation is really only useful when you have a normal distribution. The 5-number is a far more general statistical descriptor.

Climate science likes the meme of “all measurement uncertainty is random, Gaussian, and cancels” so they can use the SEM as a measurement uncertainty of the mean. The problem is that in the real world, measurement uncertainty is rarely Gaussian. Measurement instruments of a similar type typically tend to see calibration drift in the same direction because of heating of internal components. Heat generally expands materials causing the same drift direction for similar materials. Thus the measurement uncertainty overall is an asymmetric interval. A 5-number statistical descriptor works with asymmetrical intervals, mean/standard deviation statistical descriptors don’t.

John Power
Reply to  Tim Gorman
January 17, 2026 7:09 pm

Thanks, Tim, for drawing my attention to these important issues in metrology, climate science and science generally. Sorry it’s taken me so long to reply – as I explained to Jim (just above), I’ve been snowed under with work suddenly and I am still in the throes of catching up and clearing the backlog, in fact.
 
As you say, “Measurement uncertainty is defined in the GUM as a standard deviation.”
 
It seems unfortunate to me that the GUM has not made it more clear that this is not actually a definition of measurement uncertainty per se but is only a definition of a standard metric for uncertainty that can be applied generically to data that conform to the Normal/Gaussian distribution but not necessarily to other distributions. This must be confusing to a lot of people, I imagine – especially to those who have never studied statistics and don’t know what a standard deviation is to begin with.
 
Come to think of it, the GUM’s definition is even an invalid metric for uncertainty in the Gaussian context for which its use is primarily intended, because the metric does not behave according to the same intrinsic laws as uncertainty, whose magnitude remains rigidly fixed at a confidence level of 68% while the magnitude of the metric may vary from measurement to measurement!   
 
Many thanks to you for explaining the advantages of the 5-number/Box&Whiskers scheme over the traditional Normal/Gaussian scheme so beloved of ‘climate scientists’. I agree that they should start using it and the sooner the better because I think it would help to expose the implicit assumption that “all measurement uncertainty is random, Gaussian, and cancels” to the remorseless light of rigorous statistical testing.
 
I don’t mean to say that I think they should stop using the Gaussian descriptors and start using 5-number descriptors instead, but rather that I think they need the information that the 5-number descriptors could provide to show up any skewness in the data so that any systematic (i.e. non-random) biases can be detected, evaluated and, ultimately, eliminated.
 
Only when they can demonstrate that their data-set is valid and does conform to the Normal/Gaussian distribution parameters should they publish it as doing so. Otherwise, publishing it would be unethical, in my view.

Reply to  John Power
January 19, 2026 11:11 am

“It seems unfortunate to me that the GUM has not made it more clear that this is not actually a definition of measurement uncertainty per se but is only a definition of a standard metric for uncertainty that can be applied generically to data that conform to the Normal/Gaussian distribution but not necessarily to other distributions.”

What the GUM appears to try to say is that the “form” of a standard deviation is how the uncertainty should be expressed. But you are correct, this gets confusing because most of what they show in the at least the first part of the GUM assumes normal distributions.

Skewness does not come only from systematic effects, so it resists standard Gaussian analysis. Consider that cold temperatures have a larger variance than warm temperatures. Climate science tries to get around this with the excuse that “anomalies fix the problem”. Only anomalies do *NOT* fix the problem: anomalies inherit the very same variances as exist in the parent distributions. A linear transformation with a constant, i.e. subtracting a baseline temperature as a constant value, does not change the standard deviation (nor the variance) of the resulting distribution. So when you jam together anomalies from warm and cold temperatures, you wind up with a data set that is highly unlikely to be Gaussian.
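The point about the baseline subtraction can be checked in two lines: forming an anomaly is a pure location shift, so the spread is untouched. A minimal sketch with made-up numbers (not station data):

```python
import statistics

winter_temps = [-20.0, -5.0, 2.0, -12.0, 0.0]  # wide spread, in degC
baseline = statistics.mean(winter_temps)
anomalies = [t - baseline for t in winter_temps]

# Subtracting a constant shifts every value equally, so the
# standard deviation (and hence the variance) is unchanged
print(statistics.stdev(winter_temps))  # ~9.055
print(statistics.stdev(anomalies))     # identical
```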

Climate science just goes blithely along because they don’t worry about variances – which are a metric for measurement uncertainty – because they assume all measurement uncertainty just cancels out, i.e. all the data is 100% accurate.

Climate science is a metrology nightmare from the very base to the top of the hierarchy. I’ve even been given the excuse that measurement uncertainty is insignificant compared to sampling uncertainty so it can be ignored. If the sampling uncertainty is so large that it is greater than the measurement uncertainty, then that excuse *should* be damning as far as fit-for-purpose is concerned.


Michael Flynn
January 9, 2026 4:57 pm

Observing the outputs of a chaotic system like the atmosphere is preferable to observing dancing naked ladies if that’s what turns you on.

Absolutely nothing of use to humanity comes from either activity. Personal gratification only.

January 9, 2026 5:59 pm

No uncertainty bounds. Add those, and there’s not much to choose from.

Reply to  Pat Frank
January 9, 2026 10:55 pm

Each model has its own uncertainty range that isn’t posted at all.

antigtiff
January 9, 2026 6:15 pm

Wanna bet? The Predictive Markets are going to include bets on climate change.

D Sandberg
January 9, 2026 7:47 pm

 
IPCC vs Low Climate Sensitivity: Why Their Warming Projections Are 3X Higher

When I started digging into climate sensitivity numbers, I assumed the IPCC was exaggerating by about 2X. Turns out, it’s closer to 3X. Here’s the data that shocked me.

Summary

The IPCC’s scenario projections carry midpoints ranging from 1.85°C to 4.5°C of warming by 2100 (see the table below). Several prominent scientists—Lindzen, Happer, Curry, and Spencer—argue for much lower Equilibrium Climate Sensitivity (ECS) values, averaging ~1.24°C per CO₂ doubling. Using the same IPCC scenarios, here’s how the numbers compare:

Projected Warming by 2100 (Relative to Pre-Industrial)

(Equal-weight ECS values: 0.70, 0.71, 1.65, 1.90°C; midpoint = 1.24°C)

Scenario           | IPCC Range | IPCC Midpoint | Low-ECS Midpoint
SSP1-2.6 (Low)     | 1.3–2.4°C  | 1.85°C        | ~0.6°C
SSP2-4.5 (Medium)  | 2.1–3.5°C  | 2.8°C         | ~0.9°C
SSP3-7.0 (High)    | 2.8–4.6°C  | 3.7°C         | ~1.2°C
SSP5-8.5 (Extreme) | 3.3–5.7°C  | 4.5°C         | ~1.4°C

Conclusion

Under low ECS assumptions, even the most extreme scenario (SSP5-8.5) produces ~1.4°C warming, compared to IPCC’s 4.5°C midpoint. That’s not just a small difference—it’s a threefold gap. This raises serious questions about the basis for catastrophic climate projections.
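A quick consistency check on the table, under the simplifying assumption that projected warming scales roughly linearly with ECS: the ratio of the low-ECS midpoint to the IPCC midpoint should then be about the same in every row. The numbers below are copied from the table; the scaling assumption is mine.

```python
rows = {  # scenario: (IPCC midpoint degC, low-ECS midpoint degC)
    "SSP1-2.6": (1.85, 0.6),
    "SSP2-4.5": (2.8, 0.9),
    "SSP3-7.0": (3.7, 1.2),
    "SSP5-8.5": (4.5, 1.4),
}
ratios = {s: low / ipcc for s, (ipcc, low) in rows.items()}
for s, r in ratios.items():
    # every row comes out near ~0.31-0.32, i.e. roughly a threefold gap
    print(f"{s}: {r:.2f}")
```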

January 10, 2026 1:39 am

So once again, the same old question. Why do we not just throw out all the models which are falsified by observation, and just use the ones with a good track record?

And what is the point of averaging all the failures with the one or two good ones and then producing a number which is said to be somehow better than doing this?

Imagine this were some other area of science where important public policy decisions are being taken. Like, for instance, whether to license a given new treatment, and we are trying to tell how successful it is. We have a bunch of theories of why it should work, and they deliver projections. Most of them way overstate the effectiveness and safety when compared to clinical observations.

Do we make an average of them all? And in the climate case, what, anyway, is the basis for including a model in the ensemble? It’s not that the ones at the far left have any proven validity to justify inclusion. Is it just that some friends of ours wrote it, so we don’t want to be un-collegial?

Mr.
Reply to  michel
January 10, 2026 6:25 am

Yes, “going along to get along” seems to be a feature in climate science.

Nothing has changed since the Climategate scandal.
If you’re not with us, you’re against us.

Reply to  michel
January 10, 2026 6:57 am

“Why do we not just throw out all the models which are falsified by observation, and just use the ones with a good track record?”

Pat Frank came up with a simple linear equation that matches the ensemble of models. It would be simple to extend that to match the output of the most accurate model.

It’s obvious that the models, while claiming to be representations of the biosphere based on physics, are nothing more than data matching algorithms extended to give future projections. You can make a data matching algorithm as complicated as you want, with all kinds of differential equations, but in the end it is still just a data matching algorithm. AND IS USELESS FOR PREDICTING THE FUTURE IN A CHAOTIC SYSTEM LIKE THE EARTH. Couple that with climate science’s refusal to use basic metrology concepts in propagating the measurement uncertainty of the components in the model, and they just aren’t believable as physical science.

roywspencer
January 11, 2026 12:51 pm

It has amazed me that climate models are supposedly built upon “known physical processes”, yet they range over a factor of 3 in equilibrium climate sensitivity. How accurate would they have been if there was no knowledge of how much warming has occurred in the last, say, 50 years? After all, if they are just based upon “known physics”, the modelers would never have to look at observed temperature trends to make forecasts/hindcasts.

KB
Reply to  roywspencer
January 11, 2026 1:20 pm

I thought they accounted for known physical processes and whatever is left over is ascribed to CO2? Hence the wide variation in ECS.

In the plot, we get the results only in degrees per decade. But we don’t know how much of that forecast increase comes from CO2 alone. What we need to see are the climate sensitivities to CO2 that they are using.

For all we know, the most successful models have the lowest sensitivities to CO2; there is no way of telling from what we are told here.

Reply to  roywspencer
January 11, 2026 4:04 pm

There are lots of things about models that cause one to be sceptical about their predictions. “Tuning” with unknown values is one. If values are known, use them. If they aren’t known, and are critical, guessing only raises doubts about the whole process. Lastly, from the graphs I have seen, there are no noticeable pauses or even cooling periods. There are only inexorable increases that go along with increasing CO2.