By Andy May
The IPCC AR5 report was published in 2013, and Ross McKitrick and John Christy have shown that the CMIP5 climate models it used predict faster warming in the tropical troposphere than observed, at a statistically significant level.[1] This problem is acknowledged and discussed in the latest AR6 report, published in 2021, but brushed aside as unimportant. In AR6, the IPCC observed:
“The AR5 assessed with low confidence that most, though not all, CMIP3 and CMIP5 models overestimated the observed warming trend in the tropical troposphere during the satellite period 1979-2012, and that a third to a half of this difference was due to an overestimate of the SST [sea surface temperature] trend during this period. Since the AR5, additional studies based on CMIP5 and CMIP6 models show that this warming bias in tropospheric temperatures remains.”
(AR6, p. 3-23)

Figure 1 compares the overestimated warming in CMIP5 (AR5, right side of Figure 1) to the overestimated warming in CMIP6 (AR6, left side of Figure 1). The problem doesn’t “just remain”; it got worse. Notice the scale change in Figure 1: the AR6 scale goes over 0.6°C/decade, while the AR5 scale tops out at 0.5°C/decade. In AR6 we see the average and full range of 60 models with modeled SSTs (sea surface temperatures) in red and 46 models forced to use observed SSTs in blue. The models cannot get it right even when they know what the SST is, suggesting that the models have the sensitivity to greenhouse gases wrong, or that they are missing some critical climate component. Remember, the models assume that the Sun is invariant, except for the ~11-year solar cycle, and that natural variability has no pattern. Natural variability is modeled as random noise with a mean of zero climate effect.
The two graphs in Figure 1 cover slightly different time periods, and the observation datasets and areas covered differ slightly, but both are as internally consistent as possible. That is, the area covered by the observation datasets is the same area covered by the models. I refer you to AR5, AR6, and Dann Mitchell and colleagues’ 2020 paper[2] for the details, or to my new book.[3]
The AR5 profile, on the right in Figure 1, colors the 5% to 95% confidence intervals of the modeled components of warming. Blue is modeled natural warming; the blue band is drawn too narrow, as it actually extends up to the red “all forcings” band. The green band is the modeled greenhouse gas warming. In both figures, the observations fall completely below the modeled anthropogenic warming from an altitude of 300 hPa (~30,000 feet, 9 km) to 150 hPa (~44,000 feet, 13.5 km). Most of the observations fall in the modeled “natural forcings” range, suggesting that the models overestimate greenhouse gas warming in this critical part of the atmosphere, or that anthropogenic greenhouse warming has no significant effect.
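To make this kind of comparison concrete, here is a minimal sketch (with made-up illustrative numbers, not the actual AR5/AR6 data) of how one can compute the 5% to 95% envelope of an ensemble of modeled warming trends at each pressure level and check whether an observed trend falls inside it:

```python
import numpy as np

# Illustrative only: a made-up ensemble of modeled tropical tropospheric
# warming trends (degC/decade) at a few pressure levels, plus a made-up
# observed profile. These are NOT the actual AR5/AR6 numbers.
levels = [850, 500, 300, 200, 150]                      # hPa
rng = np.random.default_rng(0)
model_trends = rng.normal(loc=[0.20, 0.25, 0.35, 0.40, 0.30],
                          scale=0.06, size=(60, 5))     # 60 runs x 5 levels
obs_trend = np.array([0.14, 0.16, 0.17, 0.15, 0.10])    # hypothetical observations

# 5% to 95% envelope of the model ensemble at each level.
lo, hi = np.percentile(model_trends, [5, 95], axis=0)
inside = (obs_trend >= lo) & (obs_trend <= hi)

for p, l, h, o, ok in zip(levels, lo, hi, obs_trend, inside):
    print(f"{p:4d} hPa: model 5-95% = [{l:.2f}, {h:.2f}], "
          f"obs = {o:.2f} degC/decade -> {'inside' if ok else 'outside'} envelope")
```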
Since 1975, when Manabe and Wetherald published their landmark climate modeling paper:[4]
“…climate models have consistently shown greater warming in the upper tropical troposphere than near the surface due to increased CO2 concentrations.”[5]
(Mitchell, Lo, Seviour, Haimberger, & Polvani, 2020)
CMIP3 models were used in the AR4 report, and when Dann Mitchell and colleagues analyzed them, they found that the models generated much higher surface temperatures than observed. Further, they found that when they used atmosphere-only models and forced the surface temperature to match observations, the overheating of the troposphere was reduced, but the temperature trend in the troposphere was still too high.[6] Basically, their work shows that greenhouse gas warming is overestimated.
Mitchell’s 2013 paper[7] contains the following humorous sentence:
“The observed temperature record is one single realization of many possible realizations that could have emerged given internal climate variability.”
Mitchell, et al. (2013)
It’s not a realization, Dann, it’s reality. He’s trying to say that due to possible measurement errors and known or unknown long-term natural climate variability, we are trying to model a moving target. This is true, of course, but it is what it is, and when comparing a model to reality, the differences are errors in the models, not in the measurements. Classic model-speak; I’ve said similar stupid things in my past petrophysical modeling life. Of course, there are errors in observations, and the measurement error can be estimated, but we do not know what the natural variability is, or whether it is random over relevant time frames, as the modelers assume. The measurements are what they are, and our models must match them very closely if they are to be believed.
CMIP6 was published in 2021; thus, the statistically significant problem illustrated in Figure 1 has persisted for at least 46 years, and it is worse in 2021 than it was in 1975. We have spent billions of dollars and thousands, perhaps millions, of man-hours, and the models are getting worse with time. Why?
The AR6 models are farther from observations than the AR5 models and are far less consistent with one another. From AR6, Chapter 7:
“On average, CMIP6 models have higher mean ECS and TCR values than the CMIP5 generation of models. They also have higher mean values and wider spreads than the assessed best estimates and very likely ranges within this Report. These higher ECS and TCR values can, in some models, be traced to changes in extra-tropical cloud feedbacks that have emerged from efforts to reduce biases in these clouds compared to satellite observations (medium confidence). The broader ECS and TCR ranges from CMIP6 also lead the models to project a range of future warming that is wider than the assessed warming range, which is based on multiple lines of evidence. However, some of the high-sensitivity CMIP6 models are less consistent with observed recent changes in global warming and with paleoclimate proxy data than models with ECS within the very likely range. Similarly, some of the low-sensitivity models are less consistent with the paleoclimate data. The CMIP models with the highest ECS and TCR values provide insights into high-risk, low-likelihood futures, which cannot be excluded based on currently-available evidence. (high confidence)”
(AR6, p. 7-8 to 7-9).
Translation from IPCC-speak: Our models have gotten worse since AR5; they also produce higher climate sensitivity to CO2 than our assumed best climate sensitivity assessment. The uncertainty in our projections of future warming has increased, and our models are not very consistent with observations or the geological past, but the models might be right anyway, so be worried. I think that pretty much captures the meaning of the quote above.
The conceptual origin of the left atmospheric profile in Figure 1 is a 2020 paper by Dann Mitchell and colleagues.[8] In that paper they present the summary graph we show as Figure 2.

Mitchell, et al. point out that the difference in warming rates at 200 hPa (12 km) is a factor of about four, and the difference at 850 hPa (~1.5 km) is a factor of about two. The difference is even larger at 150 hPa (~13.5 km). These are not small differences; they are huge. Notice how small the spread in observed warming rates is, and that there is no overlap between the models and the observations at 200 hPa. This means there is much less than a 5% chance the models are accurate.
Conclusions
The differences strongly suggest that the models are overestimating the importance of greenhouse gases in global warming and missing important natural influences. This is not surprising, since the models assume that natural forces are not contributing to recent warming. Responsible modelers would recognize they are on the wrong track, abandon the Manabe and Wetherald model framework, and look elsewhere. Someone once said:
“Insanity is doing the same thing over and over again and expecting different results.”
Einstein perhaps, or someone else; regardless, it is true.
One would think after six major reports, and several minor reports, all clearly wrong in the critical tropics, the IPCC would fix the problem. But, even after all this work, they can’t. Perhaps the basic framework and assumptions they are using are wrong? Is it unreasonable to say that? I don’t think so.
One hint given in the Mitchell papers stands out. The dominant cooling mechanism in the tropics is convection, due to the high absolute humidity there. The tropics receive more solar radiation than they radiate to space; convection carries the excess energy toward the poles. Perhaps convection is modeled incorrectly in the models? Perhaps convective heat transport from the tropics to the poles is driving climate change and being overlooked? Just a thought.
The bulk of this post is an excerpt from my latest book, The Great Climate Debate: Karoly v Happer.
The bibliography can be downloaded here.
[1] (McKitrick & Christy, 2018)
[2] (Mitchell, Lo, Seviour, Haimberger, & Polvani, 2020)
[3] (May, 2022)
[4] (Manabe & Wetherald, 1975)
[5] (Mitchell, Lo, Seviour, Haimberger, & Polvani, 2020)
[6] (Mitchell, Thorne, Stott, & Gray, 2013)
[7] (Mitchell, Thorne, Stott, & Gray, 2013)
[8] (Mitchell, Lo, Seviour, Haimberger, & Polvani, 2020)
Andy,
The comment you quote from Dann Mitchell:
“The observed temperature record is one single realization of many possible realizations that could have emerged given internal climate variability.”
I would point out firstly that in geostatistics the formal mathematical view of the world (i.e., observations) is that they are a single realisation of a stochastic process. So my initial thought is to interpret that statement in the same way. In other words, we are trying to infer the underlying statistical properties of a stochastic process which we can only observe from a single realisation – the real world observations. We assume the average properties of the observed realisation over time/space are the same as the statistical properties of the ensemble. This is the ergodicity assumption. Note the statistical inference of those properties also depends very strongly on the stationarity assumption (both in space and time).
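A toy numerical illustration of that ergodicity assumption (just a sketch using a synthetic stationary process, nothing to do with any real climate dataset): the time average of one long realisation should match the ensemble average taken across many realisations.

```python
import numpy as np

# Toy stationary AR(1) process: x[t] = phi * x[t-1] + noise.
# Under ergodicity, the time average of a single long realisation should
# approach the ensemble average taken across many realisations.
rng = np.random.default_rng(42)
phi, n_steps, n_real = 0.8, 5000, 2000

x = np.zeros((n_real, n_steps))
for t in range(1, n_steps):
    x[:, t] = phi * x[:, t - 1] + rng.normal(size=n_real)

time_avg_one_realisation = x[0].mean()   # average over time, one realisation
ensemble_avg_one_time = x[:, -1].mean()  # average over realisations, one time

print(f"time average of one realisation: {time_avg_one_realisation:+.3f}")
print(f"ensemble average at one time   : {ensemble_avg_one_time:+.3f}")
# Both should be near the true mean (zero) for this stationary process.
```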
However, there is another way you could interpret Dann’s quote. This is that, due to non-uniqueness, there are a whole series of possible temperature realisations that could have arisen from the same physical properties/initial state of the climate system. This would then be a rather abstruse argument that even though climate models don’t give the same result in terms of output temps, they nonetheless represent the same climate system. Which would of course be unprovable, but fits with the IPCC statement “the climate system is a coupled non-linear chaotic system”, which they then go on to ignore and pretend doesn’t matter by using linear modelling.
That kind of reasoning leads to (a) no way of establishing statistical inference and therefore no ability to compare the modelled output to the real world, i.e., a free pass to all modelling no matter how badly it matches; (b) no hope of ever matching model output to the observations; and (c) madness.
ThinkingScientist,
The only way to validate a model is by comparison to reality. In this the models have failed. It is perfectly acceptable to include 5% to 95% confidence limits on the measurements of reality, but the models still fail, even including that. See Figure 2.
Trying to include an estimate of the variability in “reality” assumes you understand natural variability, which clearly, we do not.
If the 5% to 95% confidence interval is very large, the prediction may be of no practical utility. That is, if the nominal predicted value is 0.5, out of a physically possible range of 0.0 to 1.0, and the CI is +/-0.5, the prediction isn’t really telling you anything that you didn’t know.
Typically, unless the prediction gets one to within 10% of the actual value, it isn’t of much value. At 1%, it starts to approach ‘high precision,’ although that varies with the goal and the discipline. Six sigma, commonly achieved in physics experiments, is something climatologists can’t conceive. It isn’t even a ‘gleam in their eye,’ yet. 🙂
NB : My post responding to Andy May “above” (who was responding to Rud Istvan) contains some background information pertinent to this one.
In the AR6 WG1 report last September the IPCC claimed, in section 1.4.1 (“Baselines, reference periods and anomalies”, on page 1-54) that from now on :
I inferred from this (probably incorrectly ? …) that using 20-year trends is now OK for “climate change” investigations.
Assuming the “tropical hotspot” extends from 30°N to 30°S latitude and from 100 hPa to 450 hPa “altitude”, I extracted area-weighted averages from the gridded arrays (with “month, altitude, latitude band” coordinates) of STAR, RATPAC-B and CMIP5 “taz” data (from Climate Explorer; the equivalent for CMIP6 is not yet available) and generated the graph below of 20-year (240-month, trailing) trends.
This is still preliminary, but I think it provides a much more “dynamic” look at how the hotspot evolves over time than a single “snapshot” for a fixed (20- to 30-year-long) time period.
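For anyone who wants to attempt something similar, a minimal sketch of the kind of calculation described above (with a synthetic array standing in for the gridded STAR / RATPAC-B / CMIP5 “taz” data, and a simple unweighted average over levels) might look like this:

```python
import numpy as np

# Synthetic stand-in for a gridded anomaly array with (month, altitude,
# latitude band) coordinates over the 30N-30S, 100-450 hPa "hotspot" box.
# Real data would come from the STAR / RATPAC-B / CMIP5 "taz" files.
rng = np.random.default_rng(1)
n_months = 600
levels = [100, 150, 200, 250, 300, 400, 450]        # hPa
lats = np.arange(-27.5, 30.0, 5.0)                  # latitude band centres
temps = 0.3 * rng.normal(size=(n_months, len(levels), lats.size))

# Area-weight each latitude band by cos(latitude), then average the box
# down to a single monthly "hotspot" time series (levels weighted equally).
w = np.cos(np.deg2rad(lats))
hotspot = (temps * w).sum(axis=2) / w.sum()
hotspot = hotspot.mean(axis=1)

# 240-month (20-year) trailing trends, in degC/decade.
window = 240
t = np.arange(window) / 120.0                       # time in decades
trends = [np.polyfit(t, hotspot[i - window:i], 1)[0]
          for i in range(window, n_months + 1)]
print(f"{len(trends)} trailing 20-year trends; last = {trends[-1]:+.3f} degC/decade")
```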
NB : The “pre-processed” UAH and RSS “Tropics” timeseries data are for “20°N to 20°S” only.
Note also that for this subset of satellite data (mid-troposphere / TMT, 20°N-S or 30°N-S) both UAH and STAR TMT have zero trends for some 20(+) year periods.
RSS does not.
Follow up post …
RATPAC-A only comes in a “pre-processed” version for the “300-100 hPa” altitude layer, but does include separate “20N-S” and “Tropics [ = 30°N to 30°S … ]” columns.
Checking my (4 decimal place) extracted STAR “TMT, Global / 90°N to 90°S” area-weighted data against their (3 decimal place) pre-processed “Global_Mean_Anomaly” timeseries gave values identical to within +/- 0.0005, providing confidence that the basic approach is “reasonable” for the STAR TMT “Tropics” options (20°N-S and 30°N-S), as well as being applicable to the CMIP5 “taz” dataset.
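Expressed as code, that cross-check is just a tolerance comparison; the arrays below are placeholders, not the actual STAR series:

```python
import numpy as np

# Placeholder series: (a) an independently computed area-weighted global-mean
# anomaly rounded to 4 decimal places, and (b) the corresponding pre-processed
# "Global_Mean_Anomaly" values rounded to 3 decimal places.
my_global_mean = np.array([0.1234, -0.0567, 0.2001])
pre_processed = np.array([0.123, -0.057, 0.200])

# Agreement to within +/- 0.0005 (the rounding of the 3 dp series) suggests
# the area-weighting approach reproduces the published global mean.
max_diff = np.max(np.abs(my_global_mean - pre_processed))
print(f"max difference = {max_diff:.4f}, within tolerance: {max_diff <= 0.0005}")
```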
Well done, Mark! I’m still thinking about what you did, but it looks correct, and it seems to show there is no trend in the middle troposphere, suggesting a small and undetectable GHG influence on climate. I forwarded your comments to John Christy.
As I wrote, this is still at the “preliminary” … read “throw all datasets onto a single graph and look at it for a while” … stage.
I’m still “thinking about” the best way to move forward on this !
I’m genuinely flattered, but he’s at the “think about” 80+ individual model runs in parallel level.
Cf Willis Eschenbach, Nick Stokes, Rud Istvan, …
I’m still struggling to pare down a usable summary from the “ensemble means” level.
Follow up post 2 …
After adding an ENSO proxy (ONI V5 here) to the anomalies data it seems “obvious” (?) that
1) The models are overly sensitive, and
2) The amplitude of the short-term “noise” influence from ENSO on the “hotspot” region is much, much larger than any (purported) long-term “warming trend” added by GHG emissions (a rough way of quantifying this is sketched below)
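A rough way to quantify point 2 (sketched here with synthetic stand-ins for the ONI V5 index and the hotspot anomalies, not the real series) is to fit the anomalies to the ENSO index plus a linear trend and compare the sizes of the two terms:

```python
import numpy as np

# Synthetic stand-ins: a smoothed ENSO-like index and a hotspot anomaly series
# built mostly from that index plus a small linear trend (illustrative only).
rng = np.random.default_rng(7)
n = 480                                        # 40 years of monthly data
t = np.arange(n) / 120.0                       # time in decades
oni = np.convolve(rng.normal(size=n + 24), np.ones(24) / 24, mode="valid")[:n]
hotspot = 0.8 * oni + 0.05 * t + 0.1 * rng.normal(size=n)

# Ordinary least squares: hotspot ~ a*ONI + b*trend + c
X = np.column_stack([oni, t, np.ones(n)])
a, b, c = np.linalg.lstsq(X, hotspot, rcond=None)[0]

enso_swing = a * (oni.max() - oni.min())       # ENSO-driven peak-to-peak swing
trend_total = b * t[-1]                        # trend contribution over the record
print(f"ENSO-related swing : {enso_swing:+.2f} degC peak-to-peak")
print(f"trend contribution : {trend_total:+.2f} degC over the full period")
```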
The polar vortex will continue to strongly influence the weather in North America and eastern Europe.
http://tropic.ssec.wisc.edu/real-time/mtpw2/webAnims/tpw_nrl_colors/namer/mimictpw_namer_latest.gif
Why will La Niñas continue? Observations indicate that the warm subsurface wave will not reach the eastern Pacific before fall. As solar activity increases, the polar vortex to the south should be strong, so there should be stronger latitudinal winds, thus maintaining the current circulation over the equator.
As evidence, SOI is currently on the rise.
https://www.longpaddock.qld.gov.au/soi/
The screwy part is that their models are getting closer to observations, but only because they keep adjusting the temperature record. Take out their temperature observation adjustments and they are much further from being correct.
Regarding “The differences strongly suggest that the models are overestimating the importance of greenhouse gases in global warming and missing important natural influences”: I partially agree. Where I see the modeling overestimation is in positive feedbacks to warming caused by the increase of greenhouse gases, especially the water vapor feedback. And I see a cause, which is the groupthink of ignoring multidecadal oscillations such as the Atlantic Multidecadal Oscillation. (Michael Mann is a big name in a recent movement of denial of the AMO.)
Multidecadal oscillations, including the AMO, mostly favored a global temperature upswing during the last 30 years of the hindcasts of the CMIP3, CMIP5, and CMIP6 models. These models were mostly tuned to optimize hindcasting, especially the last 30 years of their hindcasts. Ignoring multidecadal oscillations causes these models to be tuned to attribute the warming during the last 30 years of their hindcasts to the factors that the modelers did consider, which in turn causes the models to be tuned with excessive positive feedback, especially excessive water vapor feedback, so that their forecasted warming exceeds what actually occurs. Models tuned to have an excessive water vapor feedback would also show an exaggerated tropical upper troposphere warming hotspot.
If only the modelers had considered multidecadal oscillations along with the factors they did consider, they would have tuned their parameterizations to do a better job. I expect that would have had the models (at least on average) indicating climate sensitivity around the 1.4-1.9 degrees C per 2xCO2 indicated by the studies by Nic Lewis and by Nic Lewis & Judith Curry.
As for AR6 getting worse than AR5: I see this happening as a result of climate activists, including scientists, refusing to talk with or debate other scientists, along with climate activists claiming that the IPCC was too moderate as of AR5, and the MSM from the New York Times leftward not applying critical thinking to claims by climate activists, including activist scientists such as Michael Mann. If only people would not group together into groupthink (which got worse when Usenet stopped being effective at hosting debates) and would allow actual scientific debate and actual political debate between scientists, then I would expect scientists to get better at getting things right.