Dr. Kiehl's Paradox

Guest Post by Willis Eschenbach

Back in 2007, in a paper published in GRL entitled "Twentieth century climate model response and climate sensitivity", Jeffrey Kiehl noted a curious paradox. All of the various climate models operated by different groups were able to do a reasonable job of emulating the historical surface temperature record. In fact, much is made of this agreement by people like the IPCC. They claim it shows that the models are valid, physically based representations of reality.


Figure 1. Kiehl results, comparing climate sensitivity (ECS) and total forcing. 

The paradox is that the models all report greatly varying climate sensitivities but they all give approximately the same answer … what’s up with that? Here’s how Kiehl described it in his paper:

[4] One curious aspect of this result is that it is also well known [Houghton et al., 2001] that the same models that agree in simulating the anomaly in surface air temperature differ significantly in their predicted climate sensitivity. The cited range in climate sensitivity from a wide collection of models is usually 1.5 to 4.5°C for a doubling of CO2, where most global climate models used for climate change studies vary by at least a factor of two in equilibrium sensitivity.

[5] The question is: if climate models differ by a factor of 2 to 3 in their climate sensitivity, how can they all simulate the global temperature record with a reasonable degree of accuracy?

How can that be? The models have widely varying sensitivities … but they all are able to replicate the historical temperatures? How is that possible?

Not to give away the answer, but here’s the answer that Kiehl gives (emphasis mine):

It is found that the total anthropogenic forcing for a wide range of climate models differs by a factor of two and that the total forcing is inversely correlated to climate sensitivity.

This kinda makes sense, because if the total forcing is larger, you’ll have to shrink it more (smaller sensitivity) to end up with a temperature result that fits the historical record. However, Kiehl was not quite correct.

My own research in June of this year, reported in the post Climate Sensitivity Deconstructed, has shown that the critical factor is not the total forcing as Kiehl hypothesized. What I found was that the climate sensitivity of the models is emulated very accurately by a simple trend ratio: the trend of the model output divided by the trend of the forcing.

Figure 2. Lambda compared to the trend ratio. Red shows transient climate sensitivity (TCR) of four individual models plus one 19-model average. Dark blue shows the equilibrium climate sensitivity (ECS) of the same models. Light blue shows the results of the forcing datasets applied to actual historical temperature datasets.

Note that Kiehl’s misidentification of the cause of the variations is understandable. First, the outputs of the models are all fairly similar to the historical temperature. This allowed Kiehl to ignore the model output, which simplifies the question but increases the inaccuracy. Second, the total forcing is an anomaly which starts at zero at the beginning of the historical reconstruction. As a result, the total forcing is somewhat proportional to the trend of the forcing. Again, however, this increases the inaccuracy. But for a first cut at solving the paradox, and for being the first person to write about it, I give Dr. Kiehl high marks.

Now, I probably shouldn’t have been surprised by the fact that the sensitivity as calculated by the models is nothing more than the trend ratio. After all, the canonical equation of the prevailing climate paradigm is that forcing is directly related to temperature by the climate sensitivity (lambda). In particular, they say:

Change In Temperature (∆T) = Climate Sensitivity (lambda) times Change In Forcing (∆F), or in short,

∆T = lambda ∆F

But of course, that implies that

lambda = ∆T / ∆F

And the right hand term, on average, is nothing but the ratio of the trends.

So we see that once we’ve decided what forcing dataset the model will use, and decided what historical dataset the output is supposed to match, at that point the climate sensitivity is baked in. We don’t even need the model to calculate it. It will be the trend ratio—the trend of the historical temperature dataset divided by the trend of the forcing dataset. It has to be, by definition.
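To make the arithmetic concrete, here is a minimal sketch in Python. Everything in it is invented for illustration (the forcing, the "historical" temperatures, and the value of lambda are not from any actual model); it simply shows that if the output follows ∆T = lambda ∆F on average, the ratio of the fitted trends hands lambda back to you:

```python
import numpy as np

# All values below are invented for illustration only.
years = np.arange(1900, 2001)
forcing = 0.03 * (years - 1900) + 0.2 * np.sin(0.3 * (years - 1900))  # toy forcing, W/m^2
true_lambda = 0.5                                                     # K per W/m^2, assumed
temps = true_lambda * forcing + 0.05 * np.random.default_rng(0).normal(size=years.size)

trend_T = np.polyfit(years, temps, 1)[0]    # K per year
trend_F = np.polyfit(years, forcing, 1)[0]  # W/m^2 per year
print(trend_T / trend_F)                    # ~0.5, the lambda that was baked in
```

Swap in any forcing dataset and any target temperature dataset, and the same division gives you the sensitivity before the model is ever run.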

This completely explains why, after years of better and better computer models, the models are able to hindcast the past in more detail and complexity … but they still don’t agree any better about the climate sensitivity.

The reason is that the climate sensitivity has nothing to do with the models, and everything to do with the trends of the inputs to the models (forcings) and outputs of the models (emulations of historical temperatures).

So to summarize, as Dr. Kiehl suspected, the variations in the climate sensitivity as reported by the models are due entirely to the differences in the trends of the forcings used by the various models as compared to the trends of their outputs.

Given all of that, I actually laughed out loud when I was perusing the latest United Nations Inter-Governmental Panel on Climate Change’s farrago of science, non-science, anti-science, and pseudo-science called the Fifth Assessment Report (AR5). Bear in mind that as the name implies, this is from a panel of governments, not a panel of scientists:

The model spread in equilibrium climate sensitivity ranges from 2.1°C to 4.7°C and is very similar to the assessment in the AR4. There is very high confidence that the primary factor contributing to the spread in equilibrium climate sensitivity continues to be the cloud feedback. This applies to both the modern climate and the last glacial maximum.

I laughed because crying is too depressing … they truly, truly don’t understand what they are doing. How can they have “very high confidence” (95%) that the cause is “cloud feedback”, when they admit they don’t even understand the effects of the clouds? Here’s what they say about the observations of clouds and their effects, much less the models of those observations:

• Substantial ambiguity and therefore low confidence remains in the observations of global-scale cloud variability and trends. {2.5.7}

• There is low confidence in an observed global-scale trend in drought or dryness (lack of rainfall), due to lack of direct observations, methodological uncertainties and choice and geographical inconsistencies in the trends. {2.6.2}

• There is low confidence that any reported long-term (centennial) changes in tropical cyclone characteristics are robust, after accounting for past changes in observing capabilities. {2.6.3}

I’ll tell you, I have “very low” confidence in their analysis of the confidence levels throughout the documents …

But in any case, no, dear Inter-Governmental folks, the spread in model sensitivity is not due to the admittedly poorly modeled effects of the clouds. In fact it has nothing to do with any of the inner workings of the models. Climate sensitivity is a function of the choice of forcings and desired output (historical temperature dataset), and not a lot else.

Given that level of lack of understanding on the part of the Inter-Governments, it’s gonna be a long uphill fight … but I got nothing better to do.

w.

PS—me, I think the whole concept of “climate sensitivity” is meaningless in the context of a naturally thermoregulated system such as the climate. In such a system, an increase in one area is counteracted by a decrease in another area or time frame.  See my posts It’s Not About Feedback and Emergent Climate Phenomena for a discussion of these issues.

124 Comments
Steve W.
October 2, 2013 8:26 am

It always looked to me like the models were trained to match the warming side of the PDO (ignoring that the PDO existed). That was a pretty much continuous increase in temps. That seems easy to do in a model. Just turn a knob so that modeled temps rise! Done! Reality has become much more complex now that temps are flat or falling slightly. Is it even possible to re-train these models to work now?
They have to show: the rise, a leveling off, perhaps a drop, then their hoped-for rise again (sometime after they are due to retire)?

johnfpittman
October 2, 2013 8:45 am

Greg Goodman and Joe Born: I am curious about the climate sensitivity and the absolute temperature of the models compared to each other. It would be more than just interesting if the ratio Willis is commenting on indicated a ranking according to absolute temperature. It would provide a basis for something beyond Willis’s observation that, in one sense, the relationship is mathematically trivial. His use of the equation does not bother me much, since this is one of the simplified equations used to model a linear response system for CS. You can see it in write-ups trying to explain climate sensitivity. Not saying such write-ups are particularly good or bad. Their use was for concept.

Frank S
October 2, 2013 9:08 am

I expect this has been covered in the past, but something about these confuses me and I rarely see people just dismiss out of hand any “I can prove the past” assertions.
I used to work in pretty deep theoretical aerodynamics. Watching papers presented and the like, heuristic methods were often presented. Some measurements were taken (let’s say with PIV), someone came up with a methodology for predicting the flow velocities based on those measurements, and then a new map was presented based on the equations they had invented. You knew the method was really bad if they couldn’t get the far field correct. But I was rarely impressed with the near field predictions. To give you an idea, the R^2 values were often .3 or lower (this was not easy stuff).
Then I was at a conference where the R^2 values jumped up to about .8 and higher. It took me a little while to figure out what was going on. Basically, to validate their models, they compared to the data used to create the models. I asked about this (and, as a fairly junior guy in the room, I was expecting to get laughed down). Suddenly, a lot of people agreed: You can’t validate your model with the same stuff you used to create your model. It lacks rigor. And, besides… predicting the same data set you used to create the model really shouldn’t be impressive. You need an independent data set (i.e. a different set of trials that should be comparable but imply some tweak to the system, such as different far field velocity or different disturbance size or whatever) to see if you’ve got something worthwhile.
So, if climate modeling can be done and tested against a data set to prove that it works, they would have to use only past data to create it and use future data to test it. Alternately, they could save a subset for use. For instance, if the theory goes that you need 30 years for a trend, you stop using historical data for model generation in about 1980. With the model thus generated, you then “future-cast” to 2010 and compare how you did (trends being the goal here), without any adjustments for reality (unexpected volcanos or whatever; you should have a statistical model that includes a predicted number of those anyway). If you can’t “backcast” you already failed, so you should check that before doing the “future-cast”, but your success at a backcast really shouldn’t impress anyone.
Or am I totally missing something here?
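For what it’s worth, the split described above is easy to sketch. Here is a hypothetical Python illustration; the "observations" and the linear fit standing in for a model are both invented, and only the train/future-cast split is the point:

```python
import numpy as np

# Invented data: the point is only the out-of-sample split, not the "model".
rng = np.random.default_rng(1)
years = np.arange(1900, 2011)
obs = 0.006 * (years - 1900) + 0.1 * rng.normal(size=years.size)  # fake anomalies, K

train = years <= 1980            # data allowed for model building
test = ~train                    # held back for the "future-cast"

slope, intercept = np.polyfit(years[train], obs[train], 1)  # stand-in "model"
print("future-cast trend:", slope)
print("observed 1981-2010 trend:", np.polyfit(years[test], obs[test], 1)[0])
```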

Salvatore Del Prete
October 2, 2013 9:38 am

The models will never be correct, due to the fact that they will never have complete, accurate, or comprehensive enough data to begin with, nor the proper initial state of the climate, which in the end renders the forcings they use useless.
As the decade goes on, the climate forecast that the models have given will be so far off that the IPCC and the models will be obsolete, as the temperature trend will be down in response to very prolonged solar minimum conditions. The exact opposite outcome.

Salvatore Del Prete
October 2, 2013 9:40 am

Dr. Spencer would likely see it differently from Willis, in my opinion (from what I read of him), but both agree the models are off.

Salvatore Del Prete
October 2, 2013 9:41 am

I have read that the models COULD NOT/CANNOT produce past climate scenarios.

RC Saumarez
October 2, 2013 10:23 am

Born.
You are not missing anything. The only problem is that the climate system does not behave like a simple first order system, or at least its autocorrelation properties suggest that it doesn’t. Also, consideration of the non-linearities in the climate system suggests that it would be very unlikely to behave as a linear system.
This post seems to be a trivial, circular argument. It would seem unlikely that the “magic” lambda is a constant, but this would mean integrating a product that is probably some way beyond WE’s capacity.

Martin 457
October 2, 2013 10:34 am

Climate models cannot “hindcast”. Their vision is not 20/20. It’s more like 30/50. They cannot predict the past because the past is not linear. They cannot predict the future because, the future hasn’t happened yet.
I do like how people are getting along here though. Respect for each other and all that.
Peace out.

An Inquirer
October 2, 2013 10:49 am

Steven Mosher says on October 1, 2013 at 11:52 pm:
“. . . aerosol forcing ( a free knob) . . . .”
This description of aerosol forcing should be well-remembered and widely distributed. It explains why GCMs can be highly successful in backcasting but are dramatically unreliable in forecasting.

John West
October 2, 2013 10:51 am

“How can that be? The models have widely varying sensitivities … but they all are able to replicate the historical temperatures? How is that possible?”
Easy, they use variables that can’t be measured (Fudge Factors) to match the past and then speculate into the future.
That’s what gets me with that graph that shows temperature outputs from models with natural-only forcings versus natural plus anthropogenic, compared to observations: anyone with half a brain knows it could just as easily be natural forcings that we know of versus natural forcings that we don’t know of, compared to observations. How “scientists” could ever present that as evidence is beyond me.

October 2, 2013 10:57 am

Steven Mosher says:
October 1, 2013 at 11:56 pm
——————————————————-
Paleo is better??? Unless, like Hansen, you accept the paranormal phenomenon of effect preceding cause, ice-core paleo tells you that “climate sensitivity” is effectively zero.

October 2, 2013 11:01 am

Steven Mosher says:
October 1, 2013 at 11:52 pm
yes. this comes as a surprise? check the relationship between aerosol forcing ( a free knob) and the sensitivity of models.

I have written an essay about the sensitivity of climate models to (human) aerosol forcing, used as an excuse for the 1945-1975 cooling period, and again rearing its ugly head to argue away the current standstill. It was published in Multi-Science E&E together with other essays about “Environment, Climate Change, Energy Economics and Energy Policy”. Unfortunately behind a paywall…

Steve Keohane
October 2, 2013 11:34 am

Thanks again Willis. That sensitivity curve reminds me of the card trick beginning with a fanning of the deck, and the words, ‘Pick a card, any card’.

Bob Shapiro
October 2, 2013 11:55 am

It sounds like you could “make” your own model which could compete with the various existing ones. It might be interesting to:
1. Make a model for the known CO2-only sensitivity, and also one for zero sensitivity, generating temperature prognostications.
2. Verify that your model parameters (but not the output) fall within the spread of existing models
3. Publish the results, making those sensitivities eligible to be included into AR6 (if there is one)

October 2, 2013 12:04 pm

Other than climate sensitivity likely varying with climate change, and cloud feedback not being the only poorly understood contributor to climate sensitivity, I think Kiehl was correct. And that IPCC was correct in stating that climate sensitivity to CO2 change is anywhere in a wide range because the cloud feedback is poorly understood.
The insufficiencies of the models appear to me to be a matter of wrong values for forcings other than CO2, such as aerosols, the direct effect of solar variation, cloud effects of solar variation, and multidecadal oceanic oscillations. If all of these and other significant forcings were correctly entered into the models, and the feedbacks then adjusted to achieve hindcasting, the models would be more accurate and come up with more accurate figures for climate sensitivity to CO2. Which can change with climate change, due to changes in the strength of feedbacks, such as the lapse rate one, the cloud albedo one, or the surface albedo one.

October 2, 2013 12:24 pm

Donald L. Klipstein:
In your post at October 2, 2013 at 12:04 pm you say

Other than climate sensitivity likely varying with climate change, and cloud feedback not being the only poorly understood contributor to climate sensitivity, I think Kiehl was correct. And that IPCC was correct in stating that climate sensitivity to CO2 change is anywhere in a wide range because the cloud feedback is poorly understood.
The insufficiencies of the models appear to me to be a matter of wrong values for forcings other than CO2, such as aerosols, the direct effect of solar variation, cloud effects of solar variation, and multidecadal oceanic oscillations. If all of these and other significant forcings were correctly entered into the models, and the feedbacks then adjusted to achieve hindcasting, the models would be more accurate and come up with more accurate figures for climate sensitivity to CO2. Which can change with climate change, due to changes in the strength of feedbacks, such as the lapse rate one, the cloud albedo one, or the surface albedo one.

Oh, so you “think Kiehl was correct” except for “climate sensitivity likely varying with climate change”. But Kiehl only considered climate sensitivity in the models (not in reality) and – as he reported – it has a fixed value in each model.
Also, your assertion that the models could “come up with more accurate figures for climate sensitivity to CO2” can only be true if the reason is known for why each model ‘runs hot’, and each ‘runs hot’ by a different degree from each other model. The modelers don’t know why the models ‘run hot’, do you?
I refer you to my post in this thread at October 2, 2013 at 1:37 am. This link jumps to it
http://wattsupwiththat.com/2013/10/01/dr-kiehls-paradox/#comment-1433629
Richard

Greg Goodman
October 2, 2013 1:02 pm

Willis: After all, the canonical equation of the prevailing climate paradigm is that forcing is directly related to temperature by the climate sensitivity (lambda). In particular, they say:
Change In Temperature (∆T) = Climate Sensitivity (lambda) times Change In Forcing (∆F), or in short,
∆T = lambda ∆F
====
No, Willis. This is not the canonical relationship; it is what you reduced the canonical relationship to in eqn 7 of your previous thread, in the limit of a physically unreal constant rate of change being reached, i.e. ∆T1 = ∆T2 and constant ∆F. This is equivalent to what Joe posted earlier with F = r.t with the transient removed. You are now suggesting this is the base relationship; it is not.
The canonical relationship is, to quote from Paul_K’s excellent thread that you used as a basis:
C dT/dt = F(t) - λ*T
T(k) - T(k-1) = α*F(k)/λ - α*T(k-1)
which is the same as what I posted earlier:
T2 = T1 + lambda*(F2 - F1)*(1 - exp(-dt/tau)) + exp(-dt/tau)*(T1 - T0)
(T2 - T1) - exp(-dt/tau)*(T1 - T0) = lambda*(F2 - F1)*(1 - exp(-dt/tau))
(where dt is the discrete data time interval, which is not a dimensionless unity).
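As a rough numerical check of that recursion (all numbers below are illustrative only, not taken from any model), a steadily rising forcing run through it gives an output whose trend, divided by the forcing trend, comes out close to the lambda that was put in:

```python
import numpy as np

# Illustrative values only: one-box lagged response, stepped forward in time.
dt, tau, lam = 1.0, 4.0, 0.5            # years, years, K per W/m^2 (all assumed)
a = 1.0 - np.exp(-dt / tau)

F = 0.03 * np.arange(100.0)             # made-up, steadily rising forcing (W/m^2)
T = np.zeros_like(F)
for k in range(2, F.size):
    T[k] = T[k-1] + lam * (F[k] - F[k-1]) * a + np.exp(-dt / tau) * (T[k-1] - T[k-2])

trend_T = np.polyfit(np.arange(F.size), T, 1)[0]
trend_F = np.polyfit(np.arange(F.size), F, 1)[0]
print(trend_T / trend_F)                # close to 0.5 once the start-up transient washes out
```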
So what is your trend ratio telling us? If you draw a straight line through the input time series and a straight line through the output temperature time series and take the ratio, it gives a constant, by definition. Not a result.
That the result is close to climate sensitivity (eyeballing your graph, the largest deviation looks like about 10%): is that surprising?
Most of the low-level relationships are linear. Many are linear negative feedbacks. The most obvious exception is Stefan-Boltzmann T^4, but local variation of the order of 30 degrees C amounts to only about 10% at most. The bits most likely to be non-linear (tropical storms, cloud formation and precipitation) are precisely the bits that are so poorly understood that they are not modelled at all and are replaced by WAG “parameters”.
One of the easiest ways to control a non-linear, unstable system is to add neg. feedback. Earth has proved to be long-term stable, so anything close to matching a hind-cast almost certainly will be stable too and dominated by neg. f/b.
Is there much scope for something constrained by hind-cast to be much different from what you remarked on?
Perhaps it would be informative to look at the one or two models that lie off the line. What is special about them ? Are they less/more stable?
I agree you do seem to have pinned it down better than Kiehl’s initial observation.
I think Mosh has hit it on the head. There are enough free knobs in all this to make any lambda fit.
The current IPCC position is untenable. Having been unable to put a figure on CS, they are no longer able to say what proportion of recent warming is AGW.

Greg Goodman
October 2, 2013 1:23 pm

Willis: “Because it means that the shape and form of the model are immaterial for climate sensitivity. It is determined only by the ratio of the trends of the output and the forcing. ”
It makes no sense to say the sensitivity (a parameter) of a model is determined by its output.
“Compare and contrast that with the IPCC claim, that they are “very certain” (95%) that the spread in the reported sensitivities is due to differences in “cloud feedback” … you may not be startled by my results, but the IPCC certainly would be …”
Aren’t the cloud feedback “parameters” also input parameters? I would have thought that what they choose to use as cloud parametrisations almost certainly does determine what they choose to use for CS.
That’s one of the few things they can be 95% sure of since they draw both cards from the bottom of the pack.

October 2, 2013 1:42 pm

Greg Goodman:
You make a good point at October 2, 2013 at 1:23 pm when you write

Aren’t the cloud feedback “parameters” also input parameters? I would have thought that what they choose to use as cloud parametrisations almost certainly does determine what they choose to use for CS.
That’s one of the few things they can be 95% sure of since they draw both cards from the bottom of the pack.

However, the modelers needed to hindcast the global temperature evolution of the twentieth century. And that evolution was not a linear rise.
Each model ‘runs hot’. That is, each model tends to increase global temperature over time more than was observed over the twentieth century. And each model ‘runs hot’ by a different amount. Each model is tuned to hindcast the twentieth century by adjusting aerosol cooling arbitrarily to constrain the ‘running hot’ while also adjusting CS to obtain the evolution of global temperature (which was not a linear rise).
It is that tuning of two parameters which provides the value of CS. Of course, as you suggest, instead of adjusting aerosol cooling they could have adjusted cloud behaviour or some other assumed parameter. But that would not be likely to much affect the needed value of CS in each model.
Simply, the models are basically curve fitting exercises and, therefore, it is not surprising that Willis can emulate their behaviour(s) with a curve fitted model.
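A crude sketch of that two-knob tuning, with every series and number invented purely for illustration (this is a curve-fitting toy, not any actual GCM procedure):

```python
import numpy as np
from scipy.optimize import curve_fit

# Invented series: tune a sensitivity and an aerosol scaling until the
# hindcast matches the "observed" record.
rng = np.random.default_rng(2)
years = np.arange(1900, 2001)
ghg = 0.025 * (years - 1900)                         # made-up GHG forcing, W/m^2
aerosol = -0.5 * (1 - np.exp(-(years - 1900) / 40))  # made-up aerosol forcing, W/m^2
obs = 0.4 * (ghg + 1.2 * aerosol) + 0.05 * rng.normal(size=years.size)  # fake obs, K

def hindcast(forcings, sensitivity, aerosol_scale):
    ghg_f, aer_f = forcings
    return sensitivity * (ghg_f + aerosol_scale * aer_f)

(cs, scale), _ = curve_fit(hindcast, (ghg, aerosol), obs, p0=(0.5, 1.0))
print(cs, scale)   # a stronger aerosol knob can be traded off against a higher CS
```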
Richard

October 2, 2013 2:47 pm

Willis said at http://wattsupwiththat.com/2013/10/01/dr-kiehls-paradox/#more-94963
“I think the whole concept of “climate sensitivity” is meaningless in the context of a naturally thermoregulated system such as the climate.
Hi Willis,
I agree that ”climate sensitivity” is meaningless, but perhaps for different reasons.
The only signal apparent in the modern data record is that dCO2/dt changes very soon AFTER temperature and CO2 LAGS temperature by ~9 months.
http://icecap.us/index.php/go/joes-blog/carbon_dioxide_in_not_the_primary_cause_of_global_warming_the_future_can_no/
Atmospheric CO2 also LAGS temperature in the ice core record by ~800 years on a longer time scale.
So atmospheric CO2 LAGS temperature at all measured time scales.*
So “climate sensitivity”, as used in the climate models cited by the IPCC, assumes that atmospheric CO2 primarily drives temperature, and thus assumes that the future is causing the past. I suggest that this assumption is highly improbable.
Regards, Allan
______
Post Script:
* This does not preclude the possibility that humankind is causing much of the observed increase in atmospheric CO2, nor does it preclude the possibility that CO2 is a greenhouse gas that causes some global warming. It does suggest that neither of these phenomena is catastrophic or even problematic for humanity or the environment.
As regards humanity and the environment, the evidence suggests that both increased atmospheric CO2 and slightly warmer temperatures are beneficial.
Finally, the evidence suggests that natural climate variability is far more significant and dwarfs any manmade global warming, real or imaginary. This has been my reasoned conclusion for the ~three decades that I have studied this subject, and it continues to enable a more rational understanding of Earth’s climate than has been exhibited by the global warming alarmists and the IPCC.

October 2, 2013 3:23 pm

Dear Mr. Eschenbach:
To answer your question I once spent some time working through the fortran actually comprising the majority of the model. What I found was a handful of 1960s equations coupled with dozens of functions that didn’t do anything and lots of “parametrization” whose primary consequence was making the thing’s retrocasts come out about right.
Bottom line: it doesn’t matter what assumptions you claim the model reflects if, in reality, it is internally adjusted to produce commonly accepted historical data – i.e. retrocasting predicts forecasting and competing models matching the same retro data therefore produce the same forecasts regardless of “built in” assumptions and nominal parameterization.

HR
October 2, 2013 3:43 pm

I’ve noticed other examples where there seem to be disconnects in the logic of the confidence ratings. For example, somewhere in the document they suggest the pause is caused in equal measure by internal variability and a reduced rate of change of forcings in the past decade or so, and give this high confidence. But then they go on to express much lower confidence in quantifying the forcings.
Assuming these guys aren’t idiots and also aren’t simply dishonest, then the only reason such disconnects exist is because the confidences come from many different sources. Confidence can come from the quality of data sets, understanding of processes, or professional opinion (maybe there are more). So when it comes to clouds, it might be that we don’t understand the physics or have good, long data sets, but it might be the professional opinion of many climate scientists that it’s cloud feedback that causes the model spread. Voila! You have high confidence that springs from a low level of understanding.
(Note This is just my attempt to try to understand the reasoning of the IPCC, in no way does it mean I think it’s a reasonable way to proceed.)

RC Saumarez
October 2, 2013 4:27 pm

@Willis Eschenbach
I think that modelling climate is quite a serious business. I actually think it is incredibly difficult.
If I were to model volcanic activity, I would regard it as a multiplier of the forcing, not an additive effect. That is, if a lot of ash goes into the atmosphere, it would diminish the forcing by 10% or whatever.
In that case, recovering the influence of volcanic ash on temperature would involve a homomorphic deconvolution. Given the fact that the climate response is certainly non-linear, one would be on a hiding to nothing using the awful data that is available.
If I were to try to take this problem on, which I am not because it is not my field, it would require at least 6 months of serious work. However, this does not mean that I, and others who have a mathematical and technical education, cannot comment on the use of basic maths and statistics. We can, because we know a lot more about these methods than your good self. However, you seem to churn out what you believe is profound analysis several times a week. You should ask yourself if what you write makes any sense before posting it, and then give the problem some serious thought.
Your analysis is both circular and naive.
I’m sorry that you are unable to respond to the valid criticism made by educated scientists about your posts. Simply ranting at people is not argument. I am perfectly capable of having a rational discussion about elementary signal processing and statistical methods with anyone and am prepared to be corrected by those who have greater expertise than myself.
Why not start with trying to determine if your “models” encapsulate the non-linear characteristics of the data and see how far you get?