Nic Lewis
Why matching of CMIP5 model-simulated to observed warming does not indicate model skill
A well-known Dutch journalist, Maarten Keulemans of De Volkskrant, recently tweeted an open letter to the Nobel-prizewinning physicist Professor Clauser in response to his signing of the Clintel World Climate Declaration that “There is no climate emergency”, asking for his response to various questions. One of these was:
The CLINTEL Declaration states that the world has warmed “significantly less than predicted by (the) IPCC”. Yet, a simple check of the models versus observed warming demonstrates that “climate models published since 1973 have generally been quite skillful predicting future warming”, as Zeke Hausfather’s team at Berkeley Earth recently analysed.
The most recent such analysis appears to be that shown for CMIP5 models in a tweet by Zeke Hausfather, reproduced in Figure 1. While the agreement between modeled and observed global mean surface temperature (GMST) warming over 1970–2020 shown in Figure 1 looks impressive, it is perhaps unsurprising given that modelers knew, when developing and tuning their models, what the observed warming had been over most of this period.
Figure 1. Zeke Hausfather’s comparison of global surface temperature warming in CMIP5 climate models with observational records. Simulations based on the intermediate mitigation RCP4.5 scenario of global human influence on ERF through emissions of greenhouse gases, etc. were used to extend the CMIP5 Historical simulations beyond 2005.
It is well known that climate models have a higher climate sensitivity than observations indicate. Figure 2 compares equilibrium climate sensitivity (ECS) diagnosed in CMIP5 models and in the latest generation, CMIP6, models with the corresponding observational estimate on the same basis in Lewis (2022) of 2.16°C (likely range 1.75–2.7°C). Only one model has an ECS below the estimate in Lewis (2022), and most models have ECS values exceeding the upper bound of its likely range. CMIP6 models are generally even more sensitive than CMIP5 models, with half of them having ECS values above the top of the 2.5–4°C likely range given in the IPCC’s 2021 Sixth Assessment Report: The Physical Science Basis (AR6 WG1).
Figure 2. Red bars: equilibrium climate sensitivity in CMIP5 and CMIP6 models per Zelinka et al. (2020) Tables S1 & S2 estimated by the standard method (ordinary least squares regression over years 1–150 of abrupt4xCO2 simulations). Blue line and blue shaded band: best estimate and likely (17%-83% probability) range for ECS in Lewis (2022), derived from observational evidence over the ~150 year historical period but adjusted to correspond to that estimated using the aforementioned standard method for models.
So, how is it possible that Hausfather gets an apparently good match between models and observations in the period 1970–2020? Does it imply that the models correctly represent the effects of changes in “climate forcers”, such as the atmospheric concentration of greenhouse gases and aerosols, on GMST, and accordingly that their climate sensitivities are correct?
The key question is this: a match between CMIP5 climate models, in aggregate, and observed GMST changes would only be evidence that the models correctly represent the effects of changes in “climate forcers”, such as the atmospheric concentrations of greenhouse gases and aerosols, on GMST if the resulting changes in the combined strength of those forcers in the models matched best estimates of the actual changes. The standard measure of the strength of changes in climate forcers, in terms of their effect on GMST, is their “effective radiative forcing” (ERF), which measures the effect on global radiative flux at the top of the Earth’s atmosphere once it and the land surface have adjusted to the changes in climate forcers (see IPCC AR6 WG1 Chapter 7, section 7.3).
It is therefore important to compare changes in total ERF as diagnosed in CMIP5 models during their Historical and RCP4.5 scenario simulations over 1970–2020 with the current best estimates of their actual changes, which I will take to be those per IPCC AR6 WG1 Annex III, extended from 2019 to 2020 using the almost identical Climate Indicator Project ERF time series.
Historical and RCP4.5 ERF (referred to as “adjusted forcing”) in CMIP5 models was diagnosed in Forster et al. (2013), for the 20 models with the necessary data. I take the mean ERF for that ensemble of models[1] as representing the ERF in the CMIP5 models used in Figure 1.
Figure 3 compares the foregoing estimates of mean ERF in CMIP5 models with the best estimates given in IPCC AR6. Between the early 1980s and the late 2000s the CMIP5 and AR6 ERF estimates agreed quite closely, but they diverged both before and (particularly) after that period. The main reason for their divergence since 2007 appears to be that aerosol ERF, which is negative, is now estimated to have become much smaller in magnitude over that period than was projected under the RCP4.5 scenario. Updated estimates of aerosol ERF also appear likely to account for about half of the lesser divergence prior to 1983, with the remainder mainly attributable to differences in ERF changes for land use and various other forcing agents.
Figure 3. Effective radiative forcing (ERF) over 1970–2020 as estimated in CMIP5 models (mean across 19 models) and the best estimate given in the IPCC Sixth Assessment Scientific Report (AR6 WG1). The ERF values are relative to their 1860–79 means.
The IPCC AR6 best estimate of the actual ERF change between 1970 and 2020 is 2.53 Wm−2. The linear trend change over 1970–2020 given by ordinary least squares regression is 2.66 Wm−2, while the change between the means of the first and last decades in the period, scaled to the full 50 year period, is 2.59 Wm−2.
By comparison, the mean ERF change for CMIP5 models between 1970 and 2020 is 1.67 Wm−2. The linear trend change over 1970–2020 is 1.92 Wm−2, and the scaled change between the first and last decades’ means is 1.76 Wm−2.
It is evident that the AR6 estimate of the actual 1970–2020 ERF change is far greater than that in CMIP5 models. Based on the single years 1970 and 2020, the AR6-to-CMIP5 model ERF change ratio is 1.51. Based on linear trends that ratio is 1.39, while based on first and last decades’ means it is 1.46. The last of these measures is arguably the most reliable, since single year ERF estimates may be somewhat unrepresentative, and due to intermittent volcanism the ERF has large deviations from a linear relationship to time. As there is some uncertainty I will take the ratio as being in the range 1.4 to 1.5.
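These ratio measures follow directly from the summary figures above. A quick arithmetic check, using only the rounded numbers quoted in the text (so the last measure comes out at 1.47 rather than the 1.46 presumably obtained from unrounded data):

```python
# AR6-to-CMIP5 ratio of 1970-2020 ERF changes, for the three measures
# quoted in the text: (AR6 change, CMIP5 model-mean change) in W/m^2.
measures = {
    "single years 1970 and 2020": (2.53, 1.67),
    "OLS linear trend":           (2.66, 1.92),
    "first/last decade means":    (2.59, 1.76),
}

for name, (ar6, cmip5) in measures.items():
    print(f"{name}: ratio = {ar6 / cmip5:.2f}")
# -> 1.51, 1.39 and 1.47 respectively
```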
So, CMIP5 models matched the observed 1970–2020 warming trend, but the estimated actual change in ERF was 1.4 to 1.5 times greater than that in CMIP5 models. On the assumption that both the CMIP5 model ERF estimates and the IPCC AR6 best estimates of ERFs are accurate, it follows that:
- CMIP5 models are on average 1.4 to 1.5 times as sensitive as the real climate system was to greenhouse gas and other forcings over 1970–2020[2]; and
- CMIP5 models would have over-warmed by 40–50% if their ERF change over that period had been in line with reality.
It seems clear that the ERF change in CMIP5 models over 1970–2020 was substantially less than the IPCC AR6 best estimate, and that CMIP5 models substantially overestimated the sensitivity of the climate system during that period to changes in ERF. Moreover, the divergence is increasing: the ratio of AR6 to CMIP5 model ERF changes is slightly higher if the comparison is extended to 2022.
In conclusion, Maarten Keulemans’ claim that “a simple check of the models versus observed warming demonstrates that ‘climate models published since 1973 have generally been quite skillful predicting future warming’” is false.
Contrary to the impression given by Zeke Hausfather’s rather misleading graph, CMIP5 models have not been at all skillful in predicting future warming; they have matched the illustrated 1970–2020 observed warming (which was past rather than future warming until the late 2000s, when CMIP5 models were still being tuned) due to their over-sensitivity being cancelled out by their use of ERF that increased much less than the IPCC’s latest best estimates of the actual ERF increase.
Nic Lewis 5 September 2023
[1] ex FGOALS-s2, the Historical and RCP simulations of which were subsequently withdrawn from the CMIP5 archive.
[2] There are some caveats to the conclusion that CMIP5 models were oversensitive by a factor of 1.4 to 1.5:
- the ensemble of CMIP5 models used in Forster et al. (2013) might not have been a representative subset of the entire set of CMIP5 models. However, there appears to be little or no evidence suggesting that is the case;
- despite their careful compilation, the AR6 best estimates of the evolution of ERF might be inaccurate;
- the CMIP5 model forcings derived by Forster et al. (2013) might be inaccurate. There are reasons to suspect that their method might produce ERF estimates that are up to about 10% lower than the methods used for IPCC AR6. However, Forster et al. present some evidence in favour of the accuracy of their method. Moreover, the agreement in Figure 3 between the CMIP5 and AR6 ERF time series between 1983 and 2007 (with divergences before and after then largely attributed to differences in particular forcing agents) is further evidence suggesting that the Forster et al. (2013) CMIP5 ERF estimates are fairly accurate; and
- due to the heat capacity of the ocean mixed layer, GMST is more closely related to average ERF exponentially-decayed over a few years rather than to ERF in the same year. Using exponentially-decayed ERFs would somewhat reduce the 1.4 low end estimate given above for the ratio of AR6 to CMIP5 model ERF 1970–2020 increase estimates, perhaps by ~10%.
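To illustrate the last caveat, here is a minimal sketch, entirely my own rather than taken from Lewis’s analysis, of what “exponentially-decayed” ERF means: each year’s effective value is a weighted average of current and past ERF, with weights decaying over an assumed e-folding time of a few years. The ERF series and the three-year time constant below are purely illustrative.

```python
import numpy as np

def decayed_erf(erf, tau=3.0):
    """Exponentially-decayed average of an annual ERF series.

    Weights past years by exp(-age / tau); tau (years) is an
    illustrative mixed-layer response time, not a fitted value.
    """
    out = np.empty(len(erf))
    for i in range(len(erf)):
        ages = np.arange(i, -1, -1)        # age (years) of each past value
        w = np.exp(-ages / tau)
        out[i] = np.sum(w * erf[: i + 1]) / np.sum(w)
    return out

# Toy series: a linear ramp with a one-year volcanic dip
erf = np.linspace(0.0, 2.5, 51)
erf[20] -= 2.0
smoothed = decayed_erf(erf)
# The dip is damped and the ramp is lagged, which is why using decayed
# ERF changes the 1970-2020 change ratio somewhat.
```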



The concepts of radiative forcing, feedbacks and climate sensitivity used in the climate models are pseudoscientific nonsense. An IR ‘greenhouse gas forcing’ does not change the energy balance of the earth, nor does it produce a measurable change in surface temperature. The climate models are empirically ‘tuned’ to match the mathematical construct of a global mean temperature record. One set of meaningless numbers has been ‘tuned’ to match another. Physical reality has been abandoned in favor of mathematical simplicity.
Climate model ‘tuning’ can be traced back at least to the 1981 paper by Hansen et al., specifically Figure 5 of that paper. Here, a combination of an increase in CO2 concentration, changes in solar flux and ‘volcanic aerosols’ was used to tune a 1-D RC model to match a global mean temperature record. The model was also ‘tuned’ using ‘water vapor feedback’ to get a climate sensitivity of 2.8 °C. This provided the pseudoscientific foundation for all of the later climate models. There are at least nine fundamental scientific errors in this paper – errors that have been copied by about 50 modeling groups. The fraudulent concept of radiative forcing has been used by the IPCC since it was founded in 1988. This history was reviewed by Ramaswamy et al. [2019].
The errors in the Hansen 81 paper are:
1) An IR radiative forcing (increase in atmospheric greenhouse gas concentration) does not change the energy balance of the earth. Any slight increase in heat content in the troposphere is just reradiated back to space by wideband LWIR emission.
2) There is no greenhouse effect temperature.
3) The LWIR flux is coupled to the turbulent convection in the tropospheric heat engine.
4) The mathematical artifacts created by the Manabe and Wetherald 1967 modeling assumptions are accepted without question as real surface temperature changes (see MW67 P. 242).
5) The surface energy transfer processes, in particular the coupling of the LWIR flux to the wind driven latent heat flux are ignored in their ‘slab’ ocean model.
6) The discussion of radiative perturbations to the 1-D RC model has nothing to do with the earth’s climate.
7) The role of the ocean oscillations, particularly the Atlantic Multi-decadal Oscillation (AMO) in setting the global mean temperature is ignored.
8) Any increase in surface temperature from a ‘CO2 doubling’ is too small to measure.
9) A contrived set of ‘radiative forcings’ is used to ‘tune’ the 1-D RC model so that the output artifacts appear to match the global mean temperature series.
The basic climate issue that needs to be addressed is as follows:
Since the start of the Industrial Revolution about 200 years ago, the atmospheric concentration of CO2 has increased by approximately 140 parts per million (ppm), from 280 to 420 ppm. Radiative transfer calculations show that this has produced a decrease near 2 W m-2 in the longwave IR (LWIR) flux emitted to space at the top of the atmosphere (TOA) within the spectral range of the CO2 emission bands. There has also been a similar increase in the downward LWIR flux from the lower troposphere to the surface. For a ‘CO2 doubling’ from 280 to 560 ppm, the decrease in outgoing longwave radiation (OLR) is estimated to be 3.7 W m-2. At present, the average annual increase in CO2 concentration is near 2.4 ppm. This produces an increase in the downward LWIR flux to the surface of approximately 0.034 W m-2 per year. How do these changes in LWIR flux alter the surface temperature of the earth?
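The flux figures in the preceding paragraph are broadly reproduced by the widely used simplified CO2 forcing expression of Myhre et al. (1998), ΔF = 5.35 ln(C/C0) W m-2. This is only a rough stand-in for the full radiative transfer calculations the text refers to, so the numbers differ slightly (for instance it gives about 0.03 rather than 0.034 W m-2 for one year’s 2.4 ppm increment):

```python
import math

def co2_forcing(c_ppm, c0_ppm=280.0):
    """Simplified CO2 radiative forcing (Myhre et al. 1998), in W m^-2."""
    return 5.35 * math.log(c_ppm / c0_ppm)

print(co2_forcing(420.0))           # 280 -> 420 ppm: ~2.2 W m^-2
print(co2_forcing(560.0))           # doubling 280 -> 560 ppm: ~3.7 W m^-2
print(co2_forcing(422.4, 420.0))    # one year's +2.4 ppm: ~0.03 W m^-2
```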
The short answer is that thermal engineering calculations of the change in surface temperature using the time dependent flux terms coupled to the surface thermal reservoir show that any CO2 induced change in surface temperature is ‘too small to measure’. The whole concept of radiative forcings, feedbacks and climate sensitivity as discussed in Chapter 7 of the AR6 Working Group 1 Report is pseudoscientific nonsense.
There are five parts to the engineering analysis.
1) The radiative transfer calculation of the change in LWIR flux at the top of the atmosphere (TOA) is incomplete. It has to be extended to include the change in the rate of cooling of the troposphere. When this is done, the maximum change for a ‘CO2 doubling’ is a decrease in the rate of cooling, or a slight warming of +0.08 K per day. At a lapse rate of -6.5 K km-1 an increase in temperature of +0.08 K is produced by a decrease in altitude of about 12 meters. This is equivalent to riding an elevator down four floors.
2) The upward and downward LWIR flux terms are decoupled by molecular line broadening. Almost all of the downward LWIR flux to the surface originates from within the first 2 km layer of the troposphere. Approximately half of this flux originates from the first 100 meter layer above the surface. This means that the small amount of tropospheric heating produced by a ‘greenhouse gas forcing’ is simply re-radiated to space as wideband LWIR emission (there may also be a change in altitude and therefore gravitational potential). THERE IS NO CHANGE TO THE ENERGY BALANCE OF THE EARTH. (The changes in cooling rates in the stratosphere require very small changes in flux because of the low air density).
3) At the surface, the penetration depth of the LWIR flux into the oceans is less than 100 micron (0.004 inches). Here it is fully coupled to the much larger and more variable wind driven evaporation (latent heat flux). Using long term zonal averages, the sensitivity of the latent heat flux to the wind speed within the ±30° latitude bands is at least 15 W m-2/m s-1. The 2 W m-2 increase in downward LWIR flux to the surface from 140 ppm CO2 is dissipated by an increase in wind speed of 13 centimeters per second. The annual increase of 0.034 W m-2 from 2.4 ppm CO2 is dissipated by an increase in wind speed of 2 mm s-1. Any CO2 induced ocean temperature changes are too small to measure.
4) Over land, all of the flux terms are absorbed by a thin surface layer. The surface temperature initially increases after sunrise as the solar flux is absorbed. This establishes a thermal gradient with both the cooler air above and the subsurface ground layers below. The surface-air gradient drives the evapotranspiration and the subsurface gradient conducts heat below the surface during the first part of the day after sunrise. Later in the day, as the surface cools, the subsurface gradient reverses and the stored heat is returned to the surface. As the land and air temperatures equalize in the evening, the convection stops and the surface cools more slowly by net LWIR emission. This convection transition temperature is reset each day by the local weather system passing through. Almost all of the absorbed solar heat is dissipated within the same diurnal cycle. The day to day changes in convection transition temperature are much larger than any temperature change produced by CO2.
5) When the global climate anomaly record, such as the HadCRUT4 data set is evaluated, the dominant term is found to be the Atlantic Multi-decadal Oscillation (AMO). The additional part of the recent warming may be explained as a combination of three factors. First there are urban heat islands related to population growth that were not part of the earlier record. Second, the mix of urban and rural weather stations used to create the global record has changed. Third, there are so called ‘homogenization’ adjustments that have been made to the raw temperature data. These include the ‘infilling’ of missing data and adjustments to correct for ‘bias’ related to changes in weather station location and instrumentation. It has been estimated that half of the warming in the ‘global record’ has been created by such adjustments.
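The unit conversions in points 1 and 3 above can be checked arithmetically. This sketch only reproduces the stated divisions, taking the commenter’s input numbers (0.08 K, a -6.5 K/km lapse rate, a 15 W m-2 per m/s latent-heat sensitivity) at face value rather than endorsing them:

```python
# Point 1: altitude change equivalent to a +0.08 K warming
# at a lapse rate of -6.5 K/km
delta_t = 0.08                      # K
lapse_rate = 6.5                    # K per km
print(delta_t / lapse_rate * 1000)  # ~12.3 m descent

# Point 3: wind-speed increase needed to dissipate a given
# increase in downward LWIR flux via latent heat
sensitivity = 15.0                  # W m^-2 per m s^-1
print(2.0 / sensitivity * 100)      # 2 W m^-2 -> ~13.3 cm/s
print(0.034 / sensitivity * 1000)   # 0.034 W m^-2 -> ~2.3 mm/s
```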
The climate models are based on an invalid correlation between a contrived set of radiative forcings and an equally contrived ‘global average temperature’. Such models have no predictive capabilities over climate time scales because of Lorenz instabilities. The solutions to the large number of coupled non-linear equations are unstable and the errors increase over time. The models are simply ‘tuned’ to match the global average temperature record. Simple inspection of such records reveals the 1940 AMO peak. The IPCC climate fraud then continues by separating the contrived radiative forcings into ‘human’ and ‘natural’ factors. The models are then rerun with just the natural factors and this is used to ‘attribute’ climate change to ‘human’ or ‘anthropogenic’ causes. This is illustrated in Figure 1, using graphics and data from IPCC AR6 WG1.
A more detailed discussion of climate energy transfer is provided in the recent book ‘Finding Simplicity in a Complex World – The Role of the Diurnal Temperature Cycle in Climate Energy Transfer and Climate Change’ by Roy Clark and Arthur Rörsch. A summary and selected abstracts including references relevant to this discussion are available at http://www.clarkrorschpublication.com.
More information on climate pseudoscience is available at http://www.venturaphotonics.com.
PDF summaries can be downloaded using the links:
https://venturaphotonics.com/files/VPCP_025.1_GreenhouseGasForcings.pdf
https://venturaphotonics.com/files/VPCP_026.1_TheCorruptionofClimateScience.pdf
Figure 1: Understanding the IPCC climate fraud: a) Changes in radiative forcings since 1750, b) simulated temperature increases from 1750 to 2019, based on a), c) time dependence of the radiative forcings and d) time dependence of the temperature changes derived from c), e) ‘tuned’ temperature record using a contrived set of radiative forcings that appear to simulate the global mean temperature record, f) the separation of the contrived forcings to create fraudulent ‘human’ and ‘natural’ temperature records, g) the contributions of the AMO, UHI etc. to the global mean climate record, h) the [pseudoscientific] equilibrium climate sensitivity (ECS) estimated from the CMIP6 models (IPCC AR6, WG1, figures 7.6, 7.7, 2.10, 7.8, 3.4b and FAQ 3.1 Fig. 1, ECS data from Table 7.SM.5).
Patrick Frank demonstrates curve fitting models.
https://youtu.be/0-Ke9F0m_gw?si=ORNwZntfrKyg781V
More on “implausibly hot” models
https://www.sciencemag.org/news/2021/07/un-climate-panel-confronts-implausibly-hot-forecasts-future-warming
Here are two of Gavin Schmidt’s takes on hot CMIP6:
https://www.realclimate.org/index.php/archives/2021/12/making-predictions-with-the-cmip6-ensemble/
and a graph from his Twitter thread [IIRC]: note the subtle way it’s done. The color band is actually two colors. He arbitrarily lops off all the “too hot” GCMs and takes the average of the rest to make it look closer to the surface temps. [Seems misleading to me.]
So, according to the alarmists, “the GCMs are running too hot, but we know them to be valid anyway. Trust us! “
And remember, that black line is a fabrication of agenda adjusted urban and airport temperatures.
There is absolutely no possibility of it being remotely representative of global temperatures.
Could someone smarter than I (most of you) tell me if these models use fudge factors throughout the entire period they are emulating in order to match reality?
If they do, then the model parameters at the start would be different to those at the end
Also, if they do, it would be interesting to see the end result using the last set of parameters but starting at the first year with no further ‘adjustments’.
Yes, yes and probably not. The fudge factors are necessary to make the model fit the historical data but, because they change to fit the data, I don’t think they are then used for the projections. The ECS used is about twice the amount of many modern estimates, so the models immediately start to run hot, diverging from reality fairly rapidly. If you tried the experiment you suggest you’d just get a ski slope extending upwards from the start date – not very interesting or illuminating really.
Thanks John and Richard. This is interesting to me. Why bother with hindcasting if the projections are based on different physics?
Hindcasting is used to sell the model as an accurate forecasting tool. The use of ‘fudge factors’ and specific tuning isn’t explicitly stated; the modellers are trying to fool the onlookers into believing that the model is accurate whilst trying to hide just by how much they’ve had to manipulate the data to make it fit.
Any model that fails to deliver the expected result will be modified. Any model that delivers the expected result will remain unchanged.
In Decision Support Systems, you generally accept that you use a model once. The tendency is to hang on to it because it worked, then start adding fine tuning as you go forward. In that, all you have done is curve fitting and wasted everyone’s time and money. When the model is wrong, you don’t fine tune it. You trash it and try to build another one with what you have learned. But if you inject your religion into it, you are just making things up.
The models have become the cornerstone of the new religion – to question their accuracy is to question holy writ, the very words of the Climate God.
I’m sure Nic’s explanation is meaningful but I don’t understand it. Hoping someone can help me.
My understanding is that an RCP (representative concentration pathway) is an indication of how much earth’s average global temperature will rise over the average global temperature of the mid to late 1800s. If I remember right there were four pathways: 2.6, 4.5, 6.0 and 8.5. My understanding was that the 8.5 scenario was business as usual, with CO2 emissions steadily rising; 6.0 was some effort to reduce CO2 emissions but with CO2 still steadily rising; 4.5 was substantial efforts to reduce CO2 emissions, where we would see a leveling off and some reduction in the future; and 2.6 would be an all-out serious reduction of CO2 emissions, an effort that would keep average global temperature increases below the 1.5–2.0°C level.
If all of this is true then using the 4.5 RCP and claiming it matches observations is not honest. Although Europe and the US have made efforts to lower CO2 emissions the rest of the world has not, China and India are instead massively increasing emissions. Consequently earth’s total CO2 has been increasing rather than plateauing or decreasing. Therefore using the 4.5 RCP is inappropriate.
It’s a bit complicated, but the RCPs are indeed Representative Concentration Pathways.
Each concentration pathway more or less ties in to different emissions scenarios.
As I understand it, the emissions and concentrations were assumed to map pretty much as you summarised.
Those assumptions seriously underestimated natural CO2 sinks (or overestimated CO2 residence), with the end result that subsequent emission rates give RCP 4.5 concentrations rather than RCP 8.5.
bdgwx has posted charts showing Hansen and IPCC projections vs observations, which are quite useful. Perhaps he can do the same in response to your comment.
Most current “Western” policies seem to be based on the emissions rather than concentrations.
Almost, but not quite …
This is one of the very few occasions where the AR6 assessment report from Working Group Three (WG-III) actually contains something “useful / interesting”.
Annex III is titled “Scenarios and modelling methods”, and can be downloaded as a PDF file from a link near the bottom of the following webpage.
URL : https://www.ipcc.ch/report/ar6/wg3/
From your post I think the sections the most … “helpful” (?) when it comes to answering your current set of questions are A.III.II.1.3.1, “History of scenario frameworks used by the IPCC”, and A.III.II.1.3.2, “Current scenario framework and SSP-based emission scenarios”, on pages 1872 and 1873 respectively.
From the end of section A.III.II.1.3.1 :
Notes
– The RCPs defined emission inputs to the (CMIP5) climate models that produced, on average, set targets of radiative forcing in 2100. GMST is related to RF, but they are different things.
– For the CMIP5 (/ AR5) modelling round the IPCC didn’t bother with how “feasible”, either technically or politically, a given target was. They just plugged the “GHG emissions levels” needed to reach the chosen “How much RF in 2100 ?” number into the model input files.
It was only with AR6 that the “socio-economic” aspects of the modelling process were (finally) included, as explained in section A.III.II.1.3.2 :
AR6 (WG-III) also assessed the “feasibility” of each “SSP + Radiative forcing in 2100” combination, as summarised in Annex III, Figure 4 (on page 1874, copied below).
Note that they got “feasible” combinations for SSP5 all the way down to 1.9 W/m² (in 2100).
NB : This post only partially answers your questions.
In the words of “old cocky”, if you’re looking for a complete response “It’s a bit complicated” …
Thanks Mark, I am aware of the SSPs but my comment is specifically meant to point out that Zeke used a graph of RCP 4.5 to justify that IPCC models matched observations. My point is that if he is going to use an RCP scenario it should be the one that matches reality. The reality is that regardless of what Europe and the US have done CO2 emissions have not plateaued or retreated but have increased steadily. That is not the pathway that 4.5 represents. RCP 6.0 or 8.5 would better represent today’s reality and graphs of either of those two pathways would not match observations.
As so often in the domain of “climate science”, Mark Twain’s famous quote about “lies, damned lies and statistics” applies.
Looking at fossil-fuel (not “total, including LULUCF / AFOLU” …) CO2 (not “all greenhouse gases” …) emissions, then an argument can be made that “the RCP 4.5 emissions ‘pathway’ is indeed the closest one to current observations”, and may even continue to be the closest up to (at least ?) 2030.
Each successive round of misleading curve fitting makes it more difficult for a genuine model to compete. If actual error is reasonably large, anyone who tries to develop an honest physical model will be destroyed by mathematicians with computers. I would be so upset if I were on a clean team – lost in obscurity like names that aren’t famous because of Barry Bonds and Lance Armstrong.
The match between prediction and actual shown in Fig. 1 is so good as to be totally unbelievable. The explanation is obvious. The computer runs are probably quite recent, so the modellers knew the answer already before doing the runs. Any fool can predict the past.
A modeller called it a “dirty secret”: the models are filled with “parameterisation” adjustments – otherwise known as fudge factors – so it’s easy to match the models against past observations by adjusting the fudge factors. Because of this, using adjusted model runs when the answer was already known to “prove” CO2 climate change is fraudulent.
The only way to assess a model’s prediction success is to compare it with observed temperatures AFTER the date of the computer runs. So, to check the period 1970 to 2020, computer runs from 1970 should only be used. The resulting graph would be *very* different to Fig. 1!
John Christy has done this. It shows that the models have been exaggerating the warming by almost a factor of three. They have zero predictive skill. Like the Covid graphs of doom, their sole purpose is to instill fear. They have little to do with science.
Chris
Zeke Hausenpheifer is an ilk of Stokes — a paid dupe.
“The agreement between modeled and observed global mean surface temperature (GMST) warming over 1970–2020 shown in the Figure 1” does not look “impressive” to me. The range between the high and the low values is around 0.13. The average of the models looks to be around 0.53.
Who’s impressed by a precision of 0.53 +/- 12%?
It puts the unknown area at about +/- 0.06, meaning it would be difficult to establish the difference between min and max to anything lower than the tenths digit. You certainly can’t establish a difference in the hundredths digit.
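The percentage in the comment above follows from its own numbers (a 0.13 spread read off Figure 1, around a model average of roughly 0.53):

```python
spread = 0.13            # K: highest minus lowest model value (eyeballed)
mean_value = 0.53        # K: approximate model average (eyeballed)

half_spread = spread / 2
print(half_spread)                     # ~0.065 K, i.e. about +/- 0.06
print(100 * half_spread / mean_value)  # ~12 percent
```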
I instinctively actively avoid Twitter (/ now “X” ?), so didn’t actually check your link until today.
Nic’s “most recent” graph is almost three years old …
Real Climate, where Zeke Hausfather posted as “zeke” (the website is now basically mothballed, but guess who “mike” and “gavin” are / were …), updated their “analysis” in January of this year, which included the following for the CMIP5 modelling cycle.
URL : https://www.realclimate.org/index.php/climate-model-projections-compared-to-observations/
The link to RC’s “CMIP5 historical simulations (using the RCP4.5 projection post-2005)” PNG image file should “automagically” get inserted here …
NB : “B Zipperer” already posted a copy of RC’s CMIP6 “adjustments” graph, but I will include a link to the original at RC here as well to avoid having to scroll up and down this comments section …
Note that for CMIP5 (RCP 4.5 from 2006) the “forcing adjustments” had the effect of “bending” the entire range of the IPCC’s “projections”, while for CMIP6 (SSP2-4.5 from 2015) they just “filtered out” the evidence that “the most recent models are simply running too hot”.
PS : For the CMIP3 (SRES A1B from 2001) graph they decided no “adjustments” whatsoever were required …