A Modest Proposal—Forget About Tomorrow

Guest Post by Willis Eschenbach

There’s a lovely 2005 paper I hadn’t seen before, put out by the Los Alamos National Laboratory, entitled “Our Calibrated Model has No Predictive Value” (PDF).

Figure 1. The Tinkertoy Computer. It also has no predictive value.

The paper’s abstract says it much better than I could:

Abstract: It is often assumed that once a model has been calibrated to measurements then it will have some level of predictive capability, although this may be limited. If the model does not have predictive capability then the assumption is that the model needs to be improved in some way.

Using an example from the petroleum industry, we show that cases can exist where calibrated models have no predictive capability. This occurs even when there is no modelling error present. It is also shown that the introduction of a small modelling error can make it impossible to obtain any models with useful predictive capability.

We have been unable to find ways of identifying which calibrated models will have some predictive capacity and those which will not.

There are three results in there, one expected and two unexpected.

The expected result is that models that are “tuned” or “calibrated” to an existing dataset may very well have no predictive capability. On the face of it this is obvious: if tuning a model were that simple, someone would already be predicting the stock market or next month’s weather with good accuracy.

The second result was totally unexpected: a model may have no predictive capability despite being a perfect model. It may represent the physics of the situation perfectly and exactly, in each and every relevant detail. But if that perfect model is tuned to a dataset, even a perfect dataset, it may have no predictive capability at all.

The third result, also unexpected, was the effect of error. The authors found that if there are even small modeling errors, it may not be possible to find any model with useful predictive capability.

To paraphrase, even if a tuned (“calibrated”) model is perfect about the physics, it may not have predictive capabilities. And if there is even a little error in the model, good luck finding anything useful.

This was a very clean experiment, with only three tunable parameters. John von Neumann famously said that with four parameters he could fit an elephant, and with five he could make it wiggle its trunk; here, even three were enough to fit the past perfectly while saying nothing reliable about the future.
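
To make that concrete, here’s a small Python sketch. It is not the paper’s reservoir model; the three-parameter toy function, the parameter ranges, and the “keep the 200 best fits” rule are all invented for illustration. It blindly tunes three parameters to a perfect synthetic “history” and then asks how much the best-fitting parameter sets disagree about a point well outside that history.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model(t, a, b, c):
    # A saturating response plus a slow drift. Nothing here resembles the
    # paper's reservoir simulator; it is only a stand-in with three knobs.
    return a * (1.0 - np.exp(-t / b)) + c * t

t_hist = np.linspace(0.0, 5.0, 40)   # the "past" we calibrate against
t_fore = 50.0                        # a far-future point we pretend to predict

truth = (1.0, 2.0, 0.02)
obs = toy_model(t_hist, *truth)      # perfect data, no observation error

# "Calibration" by blind tuning: sample the parameter box and keep the
# parameter sets that fit the history best.
cand = rng.uniform([0.0, 0.5, -0.1], [3.0, 5.0, 0.1], size=(200_000, 3))
a, b, c = (cand[:, [i]] for i in range(3))            # (N, 1) for broadcasting
rms = np.sqrt(np.mean((toy_model(t_hist, a, b, c) - obs) ** 2, axis=1))

best = cand[np.argsort(rms)[:200]]                    # 200 best history matches
fore = toy_model(t_fore, *best.T)

print(f"worst history-match RMS among the kept fits: {np.sort(rms)[199]:.3f}")
print(f"their forecasts at t = {t_fore}: {fore.min():.2f} to {fore.max():.2f}")
print(f"the true model gives {toy_model(t_fore, *truth):.2f}")
```

The point isn’t the particular numbers; it’s that a set of fits which are practically indistinguishable over the calibration period need not agree about the future, and the history match by itself gives you no way to pick among them.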

I leave it to the reader to consider what this means for the various climate models’ ability to simulate the future evolution of the climate. They are definitely tuned (“calibrated”, as the study’s authors put it), and they definitely have more than three tunable parameters.

In this regard, a modest proposal. Could climate scientists please just stop predicting stuff for, say, one year? In no other field of scientific endeavor is every finding surrounded by predictions that this “could” or “might” or “possibly” or “perhaps” will lead to something catastrophic in ten or thirty or a hundred years. Could I ask that, for one short year, climate scientists actually study the various climate phenomena, rather than try to forecast their future changes? We are still a long way from understanding the climate, so could we just study the present and past climate, and leave the future alone for one year?

We have no practical reason to believe that the current crop of climate models has predictive capability. For example, none of them predicted the current hiatus of fifteen years or so in the warming. And as this paper shows, there is certainly no theoretical reason to think so either.

Models, including climate models, can sometimes illustrate things or provide useful information about the climate. Could we use them for that for a while? Could we use them to try to understand the climate, rather than to predict it?

And 100- and 500-year forecasts? I don’t care if you do call them “scenarios” or whatever the current politically correct term is. Predicting anything 500 years out is a joke. Those you could stop forever, with no loss at all.

I would think that after the unbroken string of totally incorrect prognostications from Paul Ehrlich and John Holdren and James Hansen and other failed serial doomcasters, the alarmists would welcome such a hiatus from having to dream up the newer, better future catastrophe. I mean, it must get tiring for them, seeing their predictions of Thermageddon™ blown out of the water by ugly reality, time after time, without interruption. I think they’d welcome a year where they could forget about tomorrow.

Regards to all,

w.



299 Comments
November 6, 2011 5:53 am

Leif writes “And you don’t consider that an important issue? There are many models in use in the petroleum industry [e.g. TDAS] and they work very well.”
Of course I consider it important. I’d never actually considered that they might be using a tiny 3-parameter curve-fitting model when they have real models to choose from. If they’re using a real model behind the paper, then that supports my point of view. If they’re simply curve fitting, then that supports your stated point of view.
That cuts both ways, as I’m sure you’d appreciate. If they’re using a real model behind their paper, then surely you’re going to have to change your opinion.

November 6, 2011 7:52 am

TimTheToolMan says:
November 6, 2011 at 5:53 am
That cuts both ways, as I’m sure you’d appreciate. If they’re using a real model behind their paper, then surely you’re going to have to change your opinion.
The model had two parts: one that was calculating the flow using a standard procedure [that was not tuned] and the other was curve fitting of the shape of the reservoir. The paper was solely concerned with errors in the latter. And the result was that mere curve fitting [random variation of the parameter] does not have predictive power. This is clear without any elaborate considerations. Now, if the parameters were actually measured or otherwise physically determined, the model would have significant power. This is why such models are used [and are useful] in the petroleum industry. The situation carries over to climate the same way: if a parameter is played with and chosen just because it gives a better fit [tuning] I would not expect the model to gain improved predictive power, but if a parameter is calibrated better from physical measurements or theory, I would certainly expect the model to improve. I’m at a loss why that is not obvious, but perhaps I should not be presumptuous about what is obvious.
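
A toy numerical illustration of that distinction (none of this is from the paper; the Darcy-style flow model, the +1 unmodelled term, and all the numbers are invented, and the example is deliberately rigged so the unmodelled process behaves differently in the future):

```python
import numpy as np

# History: pressure drops and observed flows. The observations include a small
# process the model leaves out (a constant extra flow of +1 here); that is the
# "modelling error". All numbers are invented.
dp_hist = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k_true = 2.0                          # the permeability a core sample would give
q_hist = k_true * dp_hist + 1.0       # unmodelled extra flow during the history

# The model itself is just q = k * dp (a bare Darcy-style proportionality).

# "Tuning": pick k purely to minimise the misfit to the history (least squares).
k_tuned = np.sum(q_hist * dp_hist) / np.sum(dp_hist ** 2)

# "Measurement": use the independently measured permeability.
k_meas = k_true

# Future: larger pressure drops, and (in this rigged example) the unmodelled
# process is no longer active, so the true flows are just k_true * dp.
dp_fore = np.array([10.0, 12.0, 15.0])
q_fore = k_true * dp_fore

for name, k in [("tuned", k_tuned), ("measured", k_meas)]:
    rms_hist = np.sqrt(np.mean((q_hist - k * dp_hist) ** 2))
    rms_fore = np.sqrt(np.mean((q_fore - k * dp_fore) ** 2))
    print(f"{name:9s} k = {k:.3f}  history RMS = {rms_hist:.2f}  forecast RMS = {rms_fore:.2f}")
```

The tuned k fits the history better precisely because it has quietly absorbed the unmodelled term, and it forecasts worse for exactly the same reason; the measured k fits the history worse and forecasts better.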

November 6, 2011 2:36 pm

Leif : “one that was calculating the flow using a standard procedure [that was not tuned]”
Well, immediately I can see a problem with your expectation. One of the parameters was permeability, which is clearly to do with flow rates, and this one was tuned.

November 6, 2011 3:17 pm

TimTheToolMan says:
November 6, 2011 at 2:36 pm
Well, immediately I can see a problem with your expectation. One of the parameters was permeability, which is clearly to do with flow rates, and this one was tuned.
Of course, that was an input to the calculation and when input parameters are tuned to obtain the best fit, the model loses predictability.

November 6, 2011 3:34 pm

Leif “Of course, that was an input to the calculation and when input parameters are tuned to obtain the best fit, the model loses predictability.”
That was a parameter that was tuned to attempt to find the best value to apply to “average” permeability and hence make flow predictions. This is precisely analogous to the Charnock parameter you quoted earlier that described an average surface roughness to be used.

Spector
November 6, 2011 4:40 pm

I am somehow reminded of a circa 1972 science fiction story, “When HARLIE was One,” by David Gerrold, which was about an artificial human intelligence, who, as I recall from almost 40 years ago, was tasked with building a super computer called G.O.D. which was supposed to find the answer to all the world’s problems. David Gerrold is also famous as the author of the “Trouble with Tribbles” Star Trek episode.

November 6, 2011 8:03 pm

TimTheToolMan says:
November 6, 2011 at 3:34 pm
That was a parameter that was tuned to attempt to find the best value to apply to “average” permeability and hence make flow predictions. This is precisely analogous to the Charnock parameter you quoted earlier that described an average surface roughness to be used.
The difference is that the Charnock parameter is measured and not willy-nilly fitted [i.e. tuned] to have the overall fit be better. Try to read some of my discussion with Richard to see if you can grok the difference.

November 6, 2011 8:31 pm

Leif “The difference is that the Charnock parameter is measured and not willy-nilly fitted [i.e. tuned] to have the overall fit be better.”
Do you think permeability isn’t a well-known quantity, measured from rock and sand samples? The application of that parameter is almost certainly a “last mile” adjustment: it takes the “expected from physics” permeability, calculated from the sand and rock types, moisture, and whatever else goes into permeability calculations for the basin being modelled, and brings it into line with the reality that is measured. From what I can see, the Charnock parameter is no different; it is an approximation based on imperfect knowledge of features in the area.
I couldn’t see any discussion with Richard on this. I saw some with Rational Debate that might have been relevant. Do you want to give me a posting time or something?

November 6, 2011 9:38 pm

TimTheToolMan says:
November 6, 2011 at 8:31 pm
From what I can see, the Charnock parameter is no different; it is an approximation based on imperfect knowledge of features in the area.
Yes, there is no difference, except that the Charnock parameter is measured or at least estimated and not determined by tuning, which is varying the parameter at random until the fit with the past is optimal, and then hoping that that [unphysical] fit also predicts the future [which it might not do].
I couldn’t see any discussion with Richard on this. I saw some with Rational Debate that might have been relevant. Do you want to give me a posting time or something?
I meant Rational Debate, of course.

November 6, 2011 10:54 pm

Leif writes “Yes, there is no difference, except that the Charnock parameter is measured or at least estimated and not determined by tuning, which is varying the parameter at random until the fit with the past is optimal”
Who says it’s totally random? They’re certainly not saying there was an unlimited range on the values used. So if their tuning was constrained to be within what might be expected as possible values given the expected theoretical permeability from drill cores and whatever, then you’d agree that this paper does damn your argument?

November 6, 2011 11:29 pm

Having looked again at the paper, I see they say:
“The ranges that the model parameters were allowed to take are: h ∈ (0, 60), kg ∈ (100, 200) and kp ∈ (0, 50).”
And it looks like they’ve explored all the possibilities, so none but the true values gave them predictive power. And then, once small errors were introduced into the model (by perturbing the permeability), no values were predictive at all.
So whilst it may be arguable that they allowed values far from reality (I don’t know what realistic values may have been), it also appears to be true that they tested values close to reality.
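
For what it’s worth, here is roughly what such an exhaustive scan looks like in code. The paper’s reservoir simulator is not reproduced; the two “history” observables and the “forecast” quantity below are invented stand-ins, chosen only so that the history under-determines the three parameters. The scanned ranges are the ones quoted above.

```python
import numpy as np
from itertools import product

def history(h, kg, kp):
    # Stand-in for the simulator's history-period output: two observables
    # for three unknowns, so many parameter triples can match them.
    return np.array([h + kp, kg + 2.0 * kp])

def forecast(h, kg, kp):
    # Stand-in for the quantity we would like the calibrated model to predict.
    return h * kg

truth = (30.0, 150.0, 25.0)
obs = history(*truth)

matched = []
for h, kg, kp in product(np.linspace(0, 60, 61),
                         np.linspace(100, 200, 51),
                         np.linspace(0, 50, 51)):
    if np.max(np.abs(history(h, kg, kp) - obs)) < 0.25:   # matches the history
        matched.append((h, kg, kp))

preds = np.array([forecast(*m) for m in matched])
print(f"{len(matched)} triples in the scanned ranges match the history")
print(f"their forecasts range from {preds.min():.0f} to {preds.max():.0f}; "
      f"the true value is {forecast(*truth):.0f}")
```

Every triple in the matched set reproduces the “history” exactly, yet their forecasts span a wide range, and only the true triple gets the forecast right.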

November 7, 2011 6:47 am

TimTheToolMan says:
November 6, 2011 at 10:54 pm
So if their tuning was constrained to be within what might be expected as possible values given the expected theoretical permeability from drill cores and whatever, then you’d agree that this paper does damn your argument?
So whilst it may be arguable that they allowed values far from reality (I don’t know what realistic values may have been), it also appears to be true that they tested values close to reality.

To repeat the melt pond example: the allowed values for when melting occurs are the 12 months. If tuning shows that the test ‘if month = October or month = July or month = February then …’ gives the best fit, then that does not improve predictability; but if a calculation of the insolation is used, or even an approximation [if month = June or month = July], then, since that is based on sound physics, one could expect improved predictability.
We assume that if all the parameters were set to their true values, then their model should work and give useful prediction. The problem is that we do not know which one of the 7000 ‘predictions’ given is the right one.
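
A minimal sketch of that contrast (both rules below are invented for illustration; the “physical” one is a crude noon-sun proxy, not a real radiation scheme):

```python
import math

def melt_tuned(month):
    # A rule found purely by fitting one past record. There is no physical
    # reason for October or February to be in the list, so there is no reason
    # to trust it anywhere else.
    return month in (2, 7, 10)

def melt_physical(month, latitude_deg=75.0, threshold_w_m2=250.0):
    # Crude noon-sun proxy: solar constant times the cosine of the noon zenith
    # angle, with a sinusoidal declination. The form and the threshold are
    # placeholders, not a real radiation scheme, but they are at least physics.
    declination = 23.4 * math.cos(2.0 * math.pi * (month - 6.5) / 12.0)
    noon_insolation = 1361.0 * max(0.0, math.cos(math.radians(latitude_deg - declination)))
    return noon_insolation > threshold_w_m2

for m in range(1, 13):
    print(f"month {m:2d}: tuned rule says {melt_tuned(m)!s:5s} physical rule says {melt_physical(m)}")
```

Both rules can be made to fit a given record; only the second carries over to a different latitude or a different year, because it is tied to something measured rather than to the record it was fitted to.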

November 7, 2011 5:40 pm

Leif writes “We assume that if all the parameters were set to their true values, then their model should work and give useful prediction.”
And we see by experiment that if the parameters aren’t exactly correct, or even if they are exactly correct but there are small errors in the model, then there is no predictive power in the model. This contradicts your fundamental assumption. And it’s not “intuition”; it’s an actual demonstrable result.

November 7, 2011 6:29 pm

TimTheToolMan says:
November 7, 2011 at 5:40 pm
And we see by experiment that if the parameters aren’t exactly correct, or even if they are exactly correct but there are small errors in the model, then there is no predictive power in the model. This contradicts your fundamental assumption. And it’s not “intuition”; it’s an actual demonstrable result.
I think you are not getting the fundamental point: if the model is tuned by just varying the parameters to produce the best fit, then one cannot expect predictive power; but if the parameters are measured or constrained by physics independent of the model, then the model works. Otherwise the widely used model in question would be useless [and it is not – the petroleum industry makes a lot of money from well-performing models], because we can never get the parameters exactly right. But we can measure them [even with some errors], and with the measured values the model performs well. Same thing with the climate [or any other model, for that matter].

November 7, 2011 7:00 pm

Leif writes : “Otherwise the widely used model in question would be useless”
I’m not saying models are useless; they are very useful in their ability to do “what if” scenarios where parameters are tweaked in various ways to give an insight into possible futures. However, they don’t forecast the future, and they’re often simply wrong. They give a good starting point for further real investigations, but they aren’t the end result in themselves.
In other words, if the model says 3C per doubling of CO2, then that’s an indication that CO2 could warm the atmosphere. And nothing else. It’s certainly not an indication that the atmosphere will warm 3C with a doubling of CO2.
There have been a number of real-life examples in this thread where the model predicts something and reality is different. The semiconductor and aeronautical industries were mentioned.

November 7, 2011 7:49 pm

TimTheToolMan says:
November 7, 2011 at 7:00 pm
I’m not saying models are useless; they are very useful in their ability to do “what if” scenarios where parameters are tweaked in various ways to give an insight into possible futures.
You are contradicting yourself. The climate models are “what if” scenarios. E.g. “what if” CO2 increases at current rate, “what if” solar radiation changes, “what if” land-use changes. Their predictions are ‘possible futures’. To project a possible future, the model must have predictive power.
And you have still not grasped the fundamental point: if we tweak parameters to get the best fit to the past, predictive power does not follow. If we update the parameters because we have learned some physics and can represent the physics better, the expectation is improved prediction.

November 8, 2011 4:31 am

Leif writes “Their predictions are ‘possible futures’. To project a possible future, the model must have predictive power.”
No, it doesn’t have to have predictive power at all, because the model results will almost certainly be total rubbish for future prediction. Models can be run and tweaked with known values too; they’re not all about future prediction in the sense GCMs are.
For example, I expect the model that tests a new CPU design will be all about trying to make each transistor run within its spec and working out where propagation delays adversely affect its performance. The future prediction in this case is represented by the time to perform, say, a register store; the model will actually perform that register store, but what it says about the timing may well be rubbish.
This is analogous to modelling the climate. You may think all the bits are working correctly (sure, those small errors don’t matter), but the end result is rubbish because, hey, they do.
I can see that you’re going to want to argue the case that “so many” GCMs all get about the same results, but as Willis has mentioned before, those models have fundamentally different sensitivities. They can’t all be right. And the fact that they all end up in about the same place is a very strong indication they’re all plagued with confirmation bias.
Leif writes “And you have still not grasped the fundamental point: if we tweak parameters to get the best fit to the past, predictive power does not follow.”
I have grasped that perfectly well. My point is even more fundamental than that. The models are not producing correct predictions, period. No amount of tweaking, new physics, or even physics improvements will fix them, because, like the model in this article, the models are in error (by much more than the 1% used in the paper, and they always will be in my lifetime anyway) and so cannot give a correct result.

November 8, 2011 4:54 am

TimTheToolMan says:
November 8, 2011 at 4:31 am
I have grasped that perfectly well. My point is even more fundamental than that. The models are not producing correct predictions, period.
So you are maintaining that no model can ever produce correct predictions. Do you know of any model that does? Or are all models always wrong? How about the models of stellar evolution?

RACookPE1978
Editor
November 8, 2011 6:30 am

Leif Svalgaard says:
November 8, 2011 at 4:54 am
TimTheToolMan says:
November 8, 2011 at 4:31 am
I have grasped that perfectly well. My point is even more fundamental than that. The models are not producing correct predictions, period.
So you are maintaining that no model can ever produce correct predictions. Do you know of any model that does? Or are all models always wrong? How about the models of stellar evolution?

To continue the original intent of his question, and perhaps to challenge your replies above:
A finite element model (which is what these glorified circulation models are) is only an approximation of the real world. FEA models are used reliably tens of thousands of times every day in engineering. And they do produce predictable results, close to what actually happens in the real world of crystals and metals, thermal transfer, motions, stresses and strains, and mold relaxation pressures.
But these FEA only work reliably to produce near-real-world results (that is, they only predict real-world solutions accurately), as pointed out above as well, when the FEA “cubes” are near-uniform in shape,
when the total “nest” of all of the cubes most closely approximates the actual thing to be modeled,
when the “information” transferred across each and EVERY boundary is completely and accurately defined by the partial differential equations and boundary value equations of each and every cell,
and when the “information” of a flow approximation actually “approximates” the actual flow; and even then, simple pipe turbulence and laminar flow in a simple round pipe going through an elbow fails!
Now, look at the circulation models. They don’t have uniform cubes: they use 100×100 km grids that are too thin in height to model the atmosphere, they don’t change areas as they approach the poles, and they don’t model “flow” accurately enough to even approximate the jet streams or cold fronts or storms or even hurricanes in the atmosphere. They don’t model flow through those cubes accurately: you cannot see even “artificial” AMO and PDO changes, El Niños and La Niñas, or even routine tropical monsoons, or changes in the atmosphere or deserts or the Arctic or the savannahs, or heavy rainfalls over forests and jungles. They don’t contain “all” of the information that must be exchanged: only what the NCAR physicists “think” they need to model.
If you think the models are accurate, list exactly what “information” is exchanged at every boundary. Consider evaporation, for example. Do they model sea conditions for each cube? Land? How accurately? Do sea conditions change each night and day? Are storms approximated? Can you see the effects of the doldrums? Trade winds? Do winds in each land area accurately follow the real world in every cube? Do CO2 levels change in every cube as we know they actually do? How are cubes changed as the winds cross the Cascades? The Australian desert? India’s lowlands and their mountains? Can you show me snowfall in the Alps, in Sweden, and on Mt Kilimanjaro at its elevation? Does the snow fall accurately, so albedos change correctly every month?
The real world is NOT uniform averages of solar light in, clouds above, surfaces below, mountains, seas, lakes, and coastlines. They don’t model sunlight at every latitude. They don’t model clouds differently at every cube, at every half-cube, or even every 10×10 km area. They don’t change solar radiation over the year. They don’t rotate the “earth”. They don’t model the gravity effects of the water-salt-temperature changes and Coriolis-created currents. They don’t start from known boundary conditions; they let the “model” run for thousands of “days” and assume that that result is an “average” earth worldwide.
Your claim for parameters seems particularly strained: the only way for the global models to “work” in hindcasts is for each model to differently “assume” (uniquely change, er, model, its parameters for aerosols and reflectivity for each year between 1950 and 2010) so that the resulting temperatures do match what actually happened in those years.
That very change in aerosol levels alone means the models have no predictive ability.
(2) If you disagree with the above characterizations, please direct me to a graduate level text that does define all the parameters and equations for your favorite models. What text is best? What paper does list every cube and every parameter used?
(3) Let us not get into solar models in this thread: There are 10^54 reasons to wonder about many standard solar theories. 8<)
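
To put just the grid-resolution point in runnable form, here is a toy one-dimensional heat-diffusion solver. It is nothing like a GCM; the domain, the diffusivity, and the width of the heat source are arbitrary, and the source is point-sampled at cell centres, so a feature narrower than the cell spacing simply does not exist as far as the model is concerned.

```python
import numpy as np

def diffuse(n_cells, source_width=0.02, t_end=5.0):
    # Explicit finite-difference heat diffusion on a periodic 1-D domain.
    length, kappa = 1.0, 1e-3
    dx = length / n_cells
    dt = 0.25 * dx * dx / kappa                       # stable explicit step
    x = (np.arange(n_cells) + 0.5) * dx               # cell centres
    # A narrow localized heat source (a stand-in for any sub-grid feature),
    # point-sampled at the cell centres:
    source = np.where(np.abs(x - 0.5) < source_width / 2.0, 1.0, 0.0)
    temp = np.zeros(n_cells)
    for _ in range(int(t_end / dt)):
        lap = (np.roll(temp, 1) - 2.0 * temp + np.roll(temp, -1)) / (dx * dx)
        temp = temp + dt * (kappa * lap + source)
    return temp

for n in (10, 100, 1000):
    temp = diffuse(n)
    print(f"{n:5d} cells: peak temperature {temp.max():.3f}, mean temperature {temp.mean():.3f}")
```

The coarse run is not a noisier version of the fine run; the narrow source is simply absent, and everything downstream of it is absent too. Whether 100 km cells are “coarse” in that sense for the real atmosphere is exactly the question being argued above.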

November 8, 2011 10:12 am

RACookPE1978 says:
November 8, 2011 at 6:30 am
(2) If you disagree with the above characterizations, please direct me to a graduate level text that does define all the parameters and equations for your favorite models. What text is best? What paper does list every cube and every parameter used?
Here is a model: http://www.leif.org/EOS/CAM3-Climate-Model.pdf and here is a textbook: http://www.stanford.edu/group/efmh/FAMbook/FAMbook.html

November 8, 2011 12:03 pm

RACookPE1978 says:
November 8, 2011 at 6:30 am
(3) Let us not get into solar models in this thread: There are 10^54 reasons to wonder about many standard solar theories.
That is a cop-out. The stellar evolution models successfully predict the ‘climate’ on stars 10+ billion years ahead. A stellar example of successful prediction.

Editor
November 8, 2011 1:02 pm

Leif
I have been carrying out research into old climate records at the Met Office archives. I was most struck by the frequent references to sightings of the aurora borealis in southern England between around 1550 and 1620.
Is there any scientific reason for this?
tonyb

RACookPE1978
Editor
November 8, 2011 5:15 pm

Leif Svalgaard says:
November 8, 2011 at 10:12 am
1. PDF saved, thank you.
2. Ordered.

November 8, 2011 9:54 pm

climatereason says:
November 8, 2011 at 1:02 pm
frequent references to sightings of the aurora borealis in southern England between around 1550 and 1620. Is there any scientific reason for this?
Yes, solar activity was high then.
RACookPE1978 says:
November 8, 2011 at 5:15 pm
2. Ordered.
You’ll not be disappointed. I found it a very good read [and learning experience].
