Guest Post by Willis Eschenbach
Back in 2007, in a paper published in GRL entitled “Twentieth century climate model response and climate sensitivity”, Jeffrey Kiehl noted a curious paradox. The various climate models operated by different groups all do a reasonable job of emulating the historical surface temperature record. In fact, much is made of this agreement by people like the IPCC, who claim it shows that the models are valid, physically based representations of reality.
Figure 1. Kiehl results, comparing climate sensitivity (ECS) and total forcing.
The paradox is that the models all report greatly varying climate sensitivities but they all give approximately the same answer … what’s up with that? Here’s how Kiehl described it in his paper:
[4] One curious aspect of this result is that it is also well known [Houghton et al., 2001] that the same models that agree in simulating the anomaly in surface air temperature differ significantly in their predicted climate sensitivity. The cited range in climate sensitivity from a wide collection of models is usually 1.5 to 4.5C for a doubling of CO2, where most global climate models used for climate change studies vary by at least a factor of two in equilibrium sensitivity.
[5] The question is: if climate models differ by a factor of 2 to 3 in their climate sensitivity, how can they all simulate the global temperature record with a reasonable degree of accuracy?
How can that be? The models have widely varying sensitivities … but they all are able to replicate the historical temperatures? How is that possible?
Not to give away the answer, but here’s the answer that Kiehl gives (emphasis mine):
It is found that the total anthropogenic forcing for a wide range of climate models differs by a factor of two and that the total forcing is inversely correlated to climate sensitivity.
This kinda makes sense, because if the total forcing is larger, you’ll have to shrink it more (smaller sensitivity) to end up with a temperature result that fits the historical record. However, Kiehl was not quite correct.
My own research in June of this year, reported in the post Climate Sensitivity Deconstructed, has shown that the critical factor is not the total forcing as Kiehl hypothesized. What I found was that the climate sensitivity of the models is emulated very accurately by a simple trend ratio—the trend of the forcing divided by the trend of the model output.
Figure 2. Lambda compared to the trend ratio. Red shows transient climate sensitivity (TCR) of four individual models plus one 19-model average. Dark blue shows the equilibrium climate sensitivity (ECS) of the same models. Light blue are the results of the forcing datasets applied to actual historical temperature datasets.
Note that Kiehl’s misidentification of the cause of the variations is understandable. First, the outputs of the models are all fairly similar to the historical temperature. This allowed Kiehl to ignore the model output, which simplifies the question but increases the inaccuracy. Second, the total forcing is an anomaly which starts at zero at the start of the historical reconstruction. As a result, the total forcing is somewhat proportional to the trend of the forcing. Again, however, this increases the inaccuracy. But for a first cut at solving the paradox, as well as for being the first person to write about it, I give high marks to Dr. Kiehl.
Now, I probably shouldn’t have been surprised by the fact that the sensitivity as calculated by the models is nothing more than the trend ratio. After all, the canonical equation of the prevailing climate paradigm is that forcing is directly related to temperature by the climate sensitivity (lambda). In particular, they say:
Change In Temperature (∆T) = Climate Sensitivity (lambda) times Change In Forcing (∆F), or in short,
∆T = lambda ∆F
But of course, that implies that
lambda = ∆T / ∆F
And the right hand term, on average, is nothing but the ratio of the trends.
So we see that once we’ve decided what forcing dataset the model will use, and decided what historical dataset the output is supposed to match, at that point the climate sensitivity is baked in. We don’t even need the model to calculate it. It will be the trend ratio—the trend of the historical temperature dataset divided by the trend of the forcing dataset. It has to be, by definition.
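The arithmetic can be sketched in a few lines. This is a toy illustration using made-up synthetic series (the slopes and noise levels are my assumptions, not any actual forcing or temperature dataset): once both trends are fixed, the implied lambda follows with no model in the loop.

```python
import numpy as np

# Toy illustration (synthetic series, not real datasets): once the forcing
# dataset and the target temperature dataset are chosen, the implied
# sensitivity is simply the ratio of their linear trends.
rng = np.random.default_rng(0)
years = np.arange(1900, 2001)
forcing = 0.02 * (years - 1900) + rng.normal(0, 0.05, years.size)   # W/m^2
temps = 0.008 * (years - 1900) + rng.normal(0, 0.03, years.size)    # deg C

forcing_trend = np.polyfit(years, forcing, 1)[0]  # W/m^2 per year
temp_trend = np.polyfit(years, temps, 1)[0]       # deg C per year

# The "baked in" sensitivity: trend of temperature / trend of forcing
lambda_implied = temp_trend / forcing_trend       # deg C per W/m^2
print(f"implied sensitivity ~ {lambda_implied:.2f} deg C per W/m^2")
```

Whatever model sits between these two series, fitting its output to the temperature record forces the sensitivity toward this ratio (0.008/0.02 = 0.4 in this made-up example, noise aside).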
This completely explains why, after years of better and better computer models, the models are able to hindcast the past in more detail and complexity … but they still don’t agree any better about the climate sensitivity.
The reason is that the climate sensitivity has nothing to do with the models, and everything to do with the trends of the inputs to the models (forcings) and outputs of the models (emulations of historical temperatures).
So to summarize, as Dr. Kiehl suspected, the variations in the climate sensitivity as reported by the models are due entirely to the differences in the trends of the forcings used by the various models as compared to the trends of their outputs.
Given all of that, I actually laughed out loud when I was perusing the latest United Nations Inter-Governmental Panel on Climate Change’s farrago of science, non-science, anti-science, and pseudo-science called the Fifth Assessment Report (AR5). Bear in mind that as the name implies, this is from a panel of governments, not a panel of scientists:
The model spread in equilibrium climate sensitivity ranges from 2.1°C to 4.7°C and is very similar to the assessment in the AR4. There is very high confidence that the primary factor contributing to the spread in equilibrium climate sensitivity continues to be the cloud feedback. This applies to both the modern climate and the last glacial maximum.
I laughed because crying is too depressing … they truly, truly don’t understand what they are doing. How can they have “very high confidence” (95%) that the cause is “cloud feedback”, when they admit they don’t even understand the effects of the clouds? Here’s what they say about the observations of clouds and their effects, much less the models of those observations:
• Substantial ambiguity and therefore low confidence remains in the observations of global-scale cloud variability and trends. {2.5.7}
• There is low confidence in an observed global-scale trend in drought or dryness (lack of rainfall), due to lack of direct observations, methodological uncertainties and choice and geographical inconsistencies in the trends. {2.6.2}
• There is low confidence that any reported long-term (centennial) changes in tropical cyclone characteristics are robust, after accounting for past changes in observing capabilities. {2.6.3}
I’ll tell you, I have “very low” confidence in their analysis of the confidence levels throughout the documents …
But in any case, no, dear Inter-Governmental folks, the spread in model sensitivity is not due to the admittedly poorly modeled effects of the clouds. In fact it has nothing to do with any of the inner workings of the models. Climate sensitivity is a function of the choice of forcings and desired output (historical temperature dataset), and not a lot else.
Given that level of lack of understanding on the part of the Inter-Governments, it’s gonna be a long uphill fight … but I got nothing better to do.
w.
PS—me, I think the whole concept of “climate sensitivity” is meaningless in the context of a naturally thermoregulated system such as the climate. In such a system, an increase in one area is counteracted by a decrease in another area or time frame. See my posts It’s Not About Feedback and Emergent Climate Phenomena for a discussion of these issues.

Removing the _assumed_ x3 cloud feedback amplification and exaggerated volcanic (aerosol) sensitivity would get rid of a lot of the divergence problem but would not solve it. Post 2000 would still not be flat enough.
There is obviously at least one key variable missing from the models.
There finally seems to be a grudging acknowledgement in AR5 that the sun may actually affect climate. Though they are still trying to spin it as a post-1998 problem so that they don’t have to admit it was a significant component of the post-1960 warming too.
I thought ‘think of a number, then produce data to support it’ was the standard way to work in climate ‘science’, so I cannot see how there is a problem here; they are merely following the ‘professional standards’ of their area.
Roger Cohen says: Different models need different values of aerosol cooling to offset the excess warming arising from different (but too high) climate sensitivities. This makes aerosols basically an adjustable parameter.
My volcano stack plots show that the tropics are highly _insensitive_ to aerosol-driven changes in radiative input. That would presumably apply to other changes, be they solar, AGW or other, that affect radiation flux in the tropics, assuming that they do not overwhelm the range of tropical regulation (which may be the case during glaciation/deglaciation).
http://climategrog.wordpress.com/?attachment_id=286 (follow links therein for others)
It appears that the tropics have a stabilising effect on extra-tropical regions, which do show a greater sensitivity (through SST and ocean gyres). That’s my reading of those plots.
Land also warms twice as quickly as ocean, leading to the extra-tropical NH being more affected than the extra-tropical SH.
Probably trying to define one unique global “sensitivity” is not helping.
Willis – Thanks for the analysis. Your “at that point the climate sensitivity is baked in. We don’t even need the model to calculate it” is brilliant, and demonstrated by your Figure 2. So … the part of the historical record which is not understood is assigned to CO2 and Climate Sensitivity is the factor used to make it match. In other words, the whole thing is an Argument From Ignorance.
Following on from this, when the models are claimed to be accurate because they match the historical record, that is a circular argument.
No wonder we’re not impressed by the IPCC.
re http://wattsupwiththat.com/2013/10/01/dr-kiehls-paradox/#comment-1433586
[code]
T2 = T1 + lambda*(F2 - F1)*(1 - exp(-dt/tau)) + exp(-dt/tau)*(T1 - T0)

(T2 - T1) - exp(-dt/tau)*(T1 - T0) = lambda*(F2 - F1)*(1 - exp(-dt/tau))
[/code]
It can be seen that the LHS is (almost) the difference of successive temperature differences, i.e. acceleration, not a simple difference. This is the point Frank was making. There is a scaling factor on the right which is close to unity for long tau delay constants.
This is why the incorrect method was not too far off for dt/tau of around 6 or 7 in the “Eruption” post.
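For anyone who wants to check the rearrangement, here is a minimal numerical sketch of the discrete lag equation above (lambda, tau and the ramp forcing are illustrative values, not fitted ones):

```python
import math

# Minimal sketch of the discrete lag equation quoted above:
#   T2 = T1 + lambda*(F2 - F1)*(1 - exp(-dt/tau)) + exp(-dt/tau)*(T1 - T0)
# and a check that the rearranged form holds at every step.
lam, tau, dt = 0.5, 7.0, 1.0          # illustrative values, not fitted ones
a = math.exp(-dt / tau)

def step(T1, T0, F2, F1):
    """Advance the lagged temperature response by one time step."""
    return T1 + lam * (F2 - F1) * (1 - a) + a * (T1 - T0)

F = [0.02 * t for t in range(100)]    # illustrative ramp forcing, W/m^2
T = [0.0, 0.0]
for i in range(2, len(F)):
    T.append(step(T[-1], T[-2], F[i], F[i - 1]))

# Rearranged identity: (T2 - T1) - a*(T1 - T0) = lam*(F2 - F1)*(1 - a).
# The LHS is (almost) a difference of successive differences, i.e. an
# acceleration-like term rather than a simple difference.
i = 50
lhs = (T[i] - T[i - 1]) - a * (T[i - 1] - T[i - 2])
rhs = lam * (F[i] - F[i - 1]) * (1 - a)
print(abs(lhs - rhs) < 1e-9)  # True
```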
Greg Goodman says:
October 1, 2013 at 11:26 pm
Sensitivity is not meaningless, Willis. Even accepting your tropical governor hypothesis which, as you know, I am quite supportive of, no regulator is perfect. Every regulation system will maintain the control variable within certain limits, yet it needs an error signal to operate on.
Greg, you seem to assume that the chaotic system of chaotic systems has a ‘single linear sensitivity’ to all the varied inputs (forcings) and feedbacks; this appears to be illogical. Can you explain why you think such a non-linear chaotic system will have only one simple linear sensitivity?
Actually, the desired output is the sensitivity, and the forcings and climate record are both adjusted to produce it, however necessary.
“The present and future are certain, only the past is subject to change.” – Uncle Joe (per his Polish parodists).
This same error will be the cause of the clear curvature seen in figure 2 here. The curvature is the “acceleration” and you are trying to fit “speed”.
Beyond that, it’s not too surprising that the derived sensitivity depends on what you guess the forcing to be, to explain a certain data set. Is that stating anything but the obvious?
If one is to miss out a major forcing and imply the observed changes are due to another nearly insignificant “forcing” you will get a high sensitivity.
conversely, if your high sensitivity gets things badly wrong as soon as you step beyond the calibration period, it probably means you’ve missed a major forcing.
BIG clue for IPCC.
Mike Jonas: “Following on from this, when the models are claimed to be accurate because they match the historical record, that is a circular argument.”
Well actually they don’t match it that well at all. They are tuned to match 1960 onwards as well as possible. They rather ignore the fact that there’s a pretty poor match to anything earlier.
kadaka (KD Knoebel) on October 1, 2013 at 11:31 pm:
Thank you for the links. Hope others take advantage.
Steven Mosher says:
October 1, 2013 at 11:52 pm
It certainly seems to be a surprise to the IPCC folks …
w.
Steven Mosher says: “check the relationship between aerosol forcing ( a free knob) and the sensitivity of models.”
Which is precisely why I did the volcano stacks, having pointed out to Willis that inflated volcano forcing was the pillar propping up exaggerated AGW sensitivity. You can’t have one without the other. Even inflating both only works when both are present, which is why the post-Pinatubo period disproves the IPCC paradigm.
“It certainly seems to be a surprise to the IPCC folks …”
What surprised them was the lack of volcanoes. They were counting on at least one decent eruption per decade to keep the game in play.
Willis:
Thankyou for this essay.
You say
True, you have shown that and (as e.g. demonstrated by comments from Greg Goodman in this thread) it really annoys some people. But “the trend of the model output” is adjusted to fit by an arbitrary input of aerosol negative forcing.
The fact that – as Kiehl showed – each model is compensated by a different value of aerosol negative forcing demonstrates that at most only one of the models emulates the climate of the real Earth. And the fact that they each have too high a trend of model output without that compensation strongly suggests none of the models emulates the climate of the real Earth.
Simply, the climate models are wrong in principle. They need to be scrapped and done over.
I have often explained this on WUWT and most recently a few days ago. But I again copy the explanation to here for ease of any ‘newcomers’ who want to read it.
None of the models – not one of them – could match the change in mean global temperature over the past century if it did not utilise a unique value of assumed cooling from aerosols. So, inputting actual values of the cooling effect (such as the determination by Penner et al.
http://www.pnas.org/content/early/2011/07/25/1018526108.full.pdf?with-ds=yes )
would make every climate model provide a mismatch of the global warming it hindcasts and the observed global warming for the twentieth century.
This mismatch would occur because all the global climate models and energy balance models are known to provide indications which are based on
1. the assumed degree of forcings resulting from human activity that produce warming, and
2. the assumed degree of anthropogenic aerosol cooling input to each model as a ‘fiddle factor’ to obtain agreement between past average global temperature and the model’s indications of average global temperature.
More than a decade ago I published a peer-reviewed paper that showed the UK’s Hadley Centre general circulation model (GCM) could not model climate and only obtained agreement between past average global temperature and the model’s indications of average global temperature by forcing the agreement with an input of assumed anthropogenic aerosol cooling.
The input of assumed anthropogenic aerosol cooling is needed because the model ‘ran hot’; i.e. it showed an amount and a rate of global warming which was greater than was observed over the twentieth century. This failure of the model was compensated by the input of assumed anthropogenic aerosol cooling.
And my paper demonstrated that the assumption of aerosol effects being responsible for the model’s failure was incorrect.
(ref. Courtney RS An assessment of validation experiments conducted on computer models of global climate using the general circulation model of the UK’s Hadley Centre Energy & Environment, Volume 10, Number 5, pp. 491-502, September 1999).
More recently, in 2007, Kiehl published a paper that assessed 9 GCMs and two energy balance models.
(ref. Kiehl JT, Twentieth century climate model response and climate sensitivity, GRL, vol. 34, L22710, doi:10.1029/2007GL031383, 2007).
Kiehl found the same as my paper except that each model he assessed used a different aerosol ‘fix’ from every other model. This is because they all ‘run hot’ but they each ‘run hot’ to a different degree.
He says in his paper:
And, importantly, Kiehl’s paper says:
And the “magnitude of applied anthropogenic total forcing” is fixed in each model by the input value of aerosol forcing.
Thanks to Bill Illis, Kiehl’s Figure 2 can be seen at
http://img36.imageshack.us/img36/8167/kiehl2007figure2.png
Please note that the Figure is for 9 GCMs and 2 energy balance models, and its title is:
It shows that
(a) each model uses a different value for “Total anthropogenic forcing” that is in the range 0.80 W/m^-2 to 2.02 W/m^-2
but
(b) each model is forced to agree with the rate of past warming by using a different value for “Aerosol forcing” that is in the range -1.42 W/m^-2 to -0.60 W/m^-2.
In other words the models use values of “Total anthropogenic forcing” that differ by a factor of more than 2.5 and they are ‘adjusted’ by using values of assumed “Aerosol forcing” that differ by a factor of 2.4.
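The offsetting arithmetic can be sketched numerically. Under the simple linear ΔT = lambda × ΔF relationship, two hypothetical models whose sensitivities differ by a factor of two each reproduce the same observed warming, provided each picks its own aerosol offset (all numbers below are illustrative, not the values from Kiehl’s figure):

```python
# Illustrative sketch: two hypothetical models with sensitivities differing
# by a factor of two both reproduce the same observed warming, provided each
# chooses its own aerosol offset.  Numbers are made up, not Kiehl's values.
observed_warming = 0.7   # deg C over the century (illustrative)
ghg_forcing = 2.5        # W/m^2, same greenhouse forcing in both (illustrative)

results = {}
for lam in (0.4, 0.8):   # deg C per W/m^2
    # Choose aerosol forcing so that lam * (ghg + aerosol) = observed warming
    aerosol = observed_warming / lam - ghg_forcing
    total = ghg_forcing + aerosol
    results[lam] = (aerosol, total)
    print(f"lambda={lam}: aerosol={aerosol:+.3f} W/m^2, "
          f"total={total:.3f} W/m^2, warming={lam * total:.2f} deg C")
```

The higher-sensitivity model needs the more negative aerosol forcing and ends up with the smaller total forcing: exactly the inverse correlation between total forcing and sensitivity that Kiehl reported.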
So, each climate model emulates a different climate system. Hence, at most only one of them emulates the climate system of the real Earth because there is only one Earth. And the fact that they each ‘run hot’ unless fiddled by use of a completely arbitrary ‘aerosol cooling’ strongly suggests that none of them emulates the climate system of the real Earth.
Richard
Thanks, Willis.
richardscourtney says: “True, you have shown that and (as e.g. demonstrated by comments from Greg Goodman in this thread) it really annoys some people.”
Richard, if you read my comments you will see that I agree with a lot of what Willis is saying. It is not what he is suggesting that “annoys” me (your idea, not mine) but the fact that he makes mistakes, fails to correct them, and then repeats them.
I correct Willis where I think he makes technical errors. That is usually in the sense of reinforcing his work, not dismissing it.
If he could understand that you can’t regress dF on dT when the basic assumed relationship is that dT/dt is proportional to F, he would look a lot more credible. That is such a fundamental error that it undermines what he is trying to say and makes it easy for anyone who is “annoyed” by his observations to ignore them.
One thing that does annoy me a little is people like you and Ian W who can’t be bothered to read what I write before trying to criticise it.
BTW Richard, 2.02 W/m^-2
that should be W/m^2 or Wm^-2
Thanks Willis, CO2 forcing is very likely near zero anyway because external IR is absorbed but cannot be thermalised in the gas phase. It thermalises at heterogenities like clouds, surfaces and space. This process may play an important role in your climate governor system.
Courtney paper: (paywalled) http://multi-science.metapress.com/content/b2130335764k31j8/
Very interesting intro, I did not realise the key role of Hadley Centre in WG1 SSU.
This should get more attention.
Greg Goodman:
At October 2, 2013 at 1:50 am you write
I did read what you wrote, and I did not “criticise” it: I correctly said to Willis as example
I am of average height and I would be interested in why you think Ian W is below average height or if that is merely another of your mistaken assertions.
Richard
PS Thankyou for the correction to nomenclature provided in your post at October 2, 2013 at 1:54 am.
Greg Goodman:
I owe you an apology and I write to provide it.
You complained that I do not read what you write and in my annoyance I think I misread your complaint!
I now think you said you were a little annoyed by people
and did not say
you were annoyed by little people.
I was wrong. Sorry.
Richard
So lambda is proportional to the derivative of temperature with respect to forcing. Obviously; so what?
The question is: what is lambda? The models give a wide range, when granted that there is only one correct value.
Your argument strikes me as circular.
Willis
I enjoyed reading the above.
I could be wrong but are you conflating statistical modelling with “deterministic” modelling?
1) The one you describe, the trend between temperature and forcing sounds like a statistical climate model.
2) But this isn’t what the IPCC models are doing. As far as I know they perturb the system with the expected increased thermal energy from an observed increase in CO2.
– They then run their models with different feedback assumptions in order to create the training set. But this is only one part of it.
– One must also model convective transfer – during the modelling process – which is a function of finite gradients from the model resolutions (this is where they lose the plot, I think) and the assumed feedbacks and their magnitudes.
– But they also have dampeners such as aerosols to play around with and in coupled models (did temperature result in drier regions) atmosphere-ocean transfer.
I guess what I am trying to say is: does the Total Forcing equal the Net Forcing in the above plot?
I never understood the previous post, and I don’t understand this one. The emphasis on trend seems a trivial corollary of what you’d done before.
You’d already established that the following is a pretty accurate black-box representation of model behavior:
\frac{dT}{dt} = \frac{\lambda}{\tau}F – \frac{1}{\tau}T
So you know that the temperature response to a forcing having a trend from time t=0, i.e., to F = rt, is
T = \lambda r \left[ t – \tau(1 – e^{-t/\tau}) \right]
That is, the rate of change of temperature is
r\lambda(1 - e^{-t/\tau}).
If you ignore the transient response, the ratio of the temperature trend, \lambda r, to the forcing trend, r, is the transient climate sensitivity \lambda.
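That ramp-response algebra is easy to verify numerically; here is a minimal sketch with illustrative parameter values (lambda, tau and the ramp rate are made up):

```python
import numpy as np

# Numerical check of the ramp response above: integrate
#   dT/dt = (lambda*F - T) / tau   with F = r*t,
# then compare the late-time temperature trend to the forcing trend.
lam, tau, r = 0.5, 7.0, 0.03      # illustrative values
dt = 0.01
t = np.arange(0.0, 200.0, dt)
F = r * t
T = np.zeros_like(t)
for i in range(1, t.size):        # simple forward-Euler integration
    T[i] = T[i - 1] + dt * (lam * F[i - 1] - T[i - 1]) / tau

# Once the transient has died away (t >> tau), dT/dt -> lambda*r,
# so the trend ratio recovers lambda.
late = t > 100
temp_trend = np.polyfit(t[late], T[late], 1)[0]
print(temp_trend / r)  # ~ lam = 0.5
```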
This result does not seem startling in light of what you’d established previously.
What am I missing?