Guest Post by Willis Eschenbach
There’s a lovely 2005 paper I hadn’t seen, put out by the Los Alamos National Laboratory entitled “Our Calibrated Model has No Predictive Value” (PDF).
Figure 1. The Tinkertoy Computer. It also has no predictive value.
The paper’s abstract says it much better than I could:
Abstract: It is often assumed that once a model has been calibrated to measurements then it will have some level of predictive capability, although this may be limited. If the model does not have predictive capability then the assumption is that the model needs to be improved in some way.
Using an example from the petroleum industry, we show that cases can exist where calibrated models have no predictive capability. This occurs even when there is no modelling error present. It is also shown that the introduction of a small modelling error can make it impossible to obtain any models with useful predictive capability.
We have been unable to find ways of identifying which calibrated models will have some predictive capacity and those which will not.
There are three results in there, one expected and two unexpected.
The expected result is that models that are “tuned” or “calibrated” to an existing dataset may very well have no predictive capability. On the face of it this is obvious: if tuning a model were that simple, someone would be predicting the stock market or next month’s weather with good accuracy.
The next result was totally unexpected. The model may have no predictive capability despite being a perfect model. The model may represent the physics of the situation perfectly and exactly in each and every relevant detail. But if that perfect model is tuned to a dataset, even a perfect dataset, it may have no predictive capability at all.
The third unexpected result was the effect of error. The authors found that if there are even small modeling errors, it may not be possible to find any model with useful predictive capability.
To paraphrase, even if a tuned (“calibrated”) model is perfect about the physics, it may not have predictive capabilities. And if there is even a little error in the model, good luck finding anything useful.
This was a very clean experiment. There were only three tunable parameters. John von Neumann famously said that with four parameters he could fit an elephant, and with five he could make him wiggle his trunk; here, even three parameters were enough to make the fit worthless for prediction.
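To see how this can happen, here’s a small numerical toy of my own construction (it is not the paper’s actual reservoir model). A three-parameter decline-type model is fitted to 21 “historical” points, echoing the 21 data points discussed below, and every parameter set that calibrates about equally well is collected. They all match the history, yet their forecasts diverge:

```python
import math

# A toy three-parameter decline model (my construction, NOT the paper's
# reservoir model): production = q0 * exp(-d * t) + base.
def model(t, q0, d, base):
    return q0 * math.exp(-d * t) + base

# "Truth" and a 21-point calibration window.
true_params = (100.0, 0.05, 10.0)
t_cal = list(range(21))                       # t = 0 .. 20
data = [model(t, *true_params) for t in t_cal]

def rmse(params):
    # Root-mean-square misfit over the calibration window only.
    return math.sqrt(sum((model(t, *params) - y) ** 2
                         for t, y in zip(t_cal, data)) / len(t_cal))

# Collect EVERY parameter set on a coarse grid that "calibrates" acceptably.
tol = 1.0
calibrated = []
for q0 in [80.0 + 2.0 * i for i in range(21)]:           # 80 .. 120
    for d in [0.02 + 0.005 * j for j in range(13)]:      # 0.02 .. 0.08
        for base in [0.0 + 2.0 * k for k in range(11)]:  # 0 .. 20
            p = (q0, d, base)
            if rmse(p) < tol:
                calibrated.append(p)

# All of these fit the history about equally well, but their forecasts
# far outside the calibration window diverge.
forecasts = [model(100, *p) for p in calibrated]
print(len(calibrated), "acceptable calibrations")
print("forecast range at t=100:", min(forecasts), "to", max(forecasts))
```

Every one of those parameter sets “calibrates”, but they disagree substantially about the future, which is the paper’s central point in miniature.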
I leave it to the reader to consider what this means about the various climate models’ ability to simulate the future evolution of the climate, as they definitely are tuned or as the study authors call them “calibrated” models, and they definitely have more than three tunable parameters.
In this regard, a modest proposal. Could climate scientists please just stop predicting stuff for maybe, say, one year? In no other field of scientific endeavor is every finding surrounded by predictions that this “could” or “might” or “possibly” or “perhaps” will lead to something catastrophic in ten or thirty or a hundred years. Could I ask that for one short year, climate scientists actually study the various climate phenomena, rather than try to forecast their future changes? We are still a long way from understanding the climate, so could we just study the present and past climate, and leave the future alone for one year?
We have no practical reason to believe that the current crop of climate models has predictive capability. For example, none of them predicted the current 15-year or so hiatus in the warming. And as this paper shows, there is certainly no theoretical reason to think they have predictive capability.
The models, including climate models, can sometimes illustrate or provide useful information about climate. Could we use them for that for a while? Could we use them to try to understand the climate, rather than to predict the climate?
And 100 and 500 year forecasts? I don’t care if you do call them “scenarios” or whatever the current politically correct term is. Predicting anything 500 years out is a joke. Those, you could stop forever with no loss at all.
I would think that after the unbroken string of totally incorrect prognostications from Paul Ehrlich and John Holdren and James Hansen and other failed serial doomcasters, the alarmists would welcome such a hiatus from having to dream up the newer, better future catastrophe. I mean, it must get tiring for them, seeing their predictions of Thermageddon™ blown out of the water by ugly reality, time after time, without interruption. I think they’d welcome a year where they could forget about tomorrow.
Regards to all,
w.
Frank : “It just means that with *that particular* future data set (a mere 21 data points in our case), there are other solutions for the parameters that happen to give a higher peak.”
But that’s the point. With incorrect parameters you just don’t know what you’re predicting. Suggesting that with a different future data set, the results would be different and somehow “better” misses the point.
And to top it all off, we’re only considering the no-error versions of the model. There are no correct predictions to be made when there are slight errors introduced (i.e. +/- 1% in one aspect).
So bringing this back to GCMs, they are certainly flawed and therefore can’t predict the future no matter how well they are tuned. This result is especially important when considering interpolating results to regional climates. It’s little wonder it’s not (often) done.
Leif Svalgaard:
It is clear that you are being deliberately obtuse when (at November 2, 2011 at 6:05 am ) you write:
“Richard S Courtney says:
November 2, 2011 at 1:27 am
“Therefore, in accordance with my post that you have not replied, I assume your bluster is a clear admission by you that you know you are wrong.”
One more time: your post did not contain a question or anything for me to reasonably respond to. It just stated your opinion. Perhaps you could be specific and say again what is troubling you?”
But I wrote at November 1, 2011 at 4:05 pm:
“To save others finding it, I copy my point here (below) and perhaps you can
(a) try to address it
(b) explain what you think is “muddled” in it
and
(c) explain what you think is not factual but is merely “opinion” in it .”
And my fundamental point was:
“No model (of any kind) should be assumed to have more predictive skill than it has been demonstrated to possess.
That point is not “opinion”: it is fact.
Or do you want to explain what degree of trust can be justified – and how it can be justified – for the predictions of a model that has no demonstrated predictive skill?
And I said:
“Assuming undemonstrated model predictive skill is pseudoscience of precisely the same type as astrology. But no climate model has existed for 30, 50 or 100 years and, therefore, it is not possible for any of them to have demonstrated predictive skill over such times.”
Those are two more facts. Or do you want to try to dispute them?
Your bluster does not hide your evasion of this issue.
Richard
Richard S Courtney says:
November 3, 2011 at 5:10 am
It is clear that you are being deliberately obtuse
I’ll disregard this silly accusation and ascribe it to your lack of emotional stability.
“No model (of any kind) should be assumed to have more predictive skill than it has been demonstrated to possess.
The issue was that models are supposed to be ‘tuned’ to agree with observations and thus will always agree [eventually with a lag]. The assumption is that we [by working hard on this] can get better and better models and that they have predictive skill until proven otherwise. Every time a prediction fails we learn something new and can improve the models. This is how science [of any stripe] operates.
Richard S Courtney says:
November 3, 2011 at 5:10 am
Or do you want to try to dispute them?
It is perhaps somewhat presumptuous to think that your opinion deserves a response…
Willis said:
“In no other field of scientific endeavor is every finding surrounded by predictions that this “could” or “might” or “possibly” or “perhaps” will lead to something”
You got that right. I’ve been saying something similar for years. Even when they aren’t making predictions, they are still guilty of drawing conclusions that are not justified by the strength of the evidence. Every finding leads to a conclusion. The problem lies with interpretations of the past as well as the future. For instance, when you read about someone analyzing a single core sample, and then making broad statements about global conditions.
In my favorite field, cosmology, the experts aren’t afraid to say things like “the evidence is inconclusive” or “there are several popular theories”. Another difference between a field like that and climate study is that it seems like astronomers prefer to have peers check their work and validate their findings before they release any statements or publish anything.
Gary Swift says:
November 3, 2011 at 10:38 am
the experts aren’t afraid to say things like “the evidence is inconclusive” or “there are several popular theories”
It is interesting that there are no global climate models [that I know of] written by, run by, or otherwise pushed by skeptics.
Still waiting for a citation, Mosh, see below … either post’em or admit you haven’t got’em. I’m good either way.
w.
Willis Eschenbach says:
October 31, 2011 at 8:51 pm
@Leif Svalgaard,
It seems to me that what you claim is distinct from tuning is defined by you in the same way tuning has already been defined, so what you describe isn’t different. Maybe things have changed considerably in the most recent years such that tuning isn’t used; but here are some references about GCMs and tuning, that is, matching parameters to observations over successive model runs to minimize differences (exactly what the paper in the head post is all about).
– http://www.ipcc.ch/publications_and_data/ar4/wg1/en/ch8s8-1-3.html
“8.1.3 How Are Models Constructed? ”
“Some of these parameters can be measured, at least in principle, while others cannot. It is therefore common to adjust parameter values (possibly chosen from some prior distribution) in order to optimise model simulation of particular variables or to improve global heat balance. This process is often known as ‘tuning’.”
– http://iopscience.iop.org/1748-9326/3/1/014001/fulltext
” A tuning experiment is carried out with the Community Atmosphere Model version 3, where the top-of-the-atmosphere radiative balance is tuned to agree with global satellite estimates from ERBE and CERES, respectively, to investigate if the climate sensitivity of the model is dependent upon which of the datasets is used.”
– http://www.nature.com/nature/journal/v430/n7001/full/430737a.html
“Although it is often true that (in John von Neumann’s words) “the justification of such a mathematical construct is solely and precisely that it is expected to work”, practical data from instruments must be used to set limits on the physically realistic range for these parameters.”
– http://webcache.googleusercontent.com/search?q=cache:0zkMM0VjhIMJ:www.mcs.anl.gov/uploads/cels/papers/P819.ps.Z+&cd=10&hl=en&ct=clnk&gl=us Argonne National Laboratory
“Sensitivity Analysis and Parameter Tuning of a Sea-Ice Model”
“We also illustrate the effectiveness of using these sensitivity derivatives with an optimization algorithm to tune the parameters to maximize the agreement between simulated results and observational data.”
– http://www.jamstec.go.jp/frsgc/research/d3/jules/philtrans_final.pdf
“The seasonally-averaged differences between the control run and the target of NCEP seasonal surface air temperature fields is shown at the top. Below, the ensemble mean shows a marked improvement after 5 parameters were tuned. Precipitation fields were also used as a target in this experiment, but tuning was unable to significantly improve the model fields (not shown), thus demonstrating that there were substantial structural problems and motivating further model development.”
– http://www.iac.ethz.ch/groups/schaer/research/reg_modeling_and_scenarios/clim_model_calibration
“The tuning of climate models in order to match observed climatologies is a common but often concealed technique. Even in physically based global and regional climate models, some degree of model tuning is usually necessary as model parameters are often poorly confined.”
– http://www.ipcc.ch/ipccreports/tar/wg1/371.htm models being tuned to other, more complex, models’ “observations”.
“Working Group I: The Scientific Basis”
“Having selected the value of F2x and T2x appropriate to a specific AOGCM, the simple model tuning process consists of matching the AOGCM net heat flux across the ocean surface by adjusting the simple model ocean parameters following Raper et al. (2001a), using the CMIP2 results analysed in Raper et al. (2001b).”
– http://www.climatescience.gov/Library/sap/sap3-2/final-report/sap3-2-final-report-appB.pdf More on the model being tuned to emulate models; aka models as “observational data”.
“For the TAR, these parameters were tuned so that MAGICC was able to emulate results from a range of Atmosphere-Ocean General Circulation Models (AOGCMs) (Cubasch and Meehl, 2001; Raper et al., 2001).”
– From the book “Climate change: an integrated perspective” by Willem Jozef Meine Martens and Jan Rotmans, pg 79-80
“In an effort to ensure that GCMs are as faithful as possible to the observations of the ocean and atmosphere, a process of calibration is required. As the models evolve, they are continually scrutinised in the light of observations to determine if significant physical processes are missing, and to refine those that are present. The latter may involve a reworking of the existing physical parameterisations or simply a new choice for the adjustable parameters that inevitably are part of these parameterisations. Changing the latter is also called ‘tuning’.”
– http://unfccc.int/resource/brazil/climate.html
“Parameters for tuning a simple climate model (plus aerosol forcing)”
– http://climateaudit.org/2007/12/01/tuning-gcms/ Analysis of Kiehl et al.
——————————————
And there are many more!
So Leif, I will need you to explain to me how this tuning presented from all these sources differs from the tuning under discussion by the paper here, or even defined by you. I frankly see no descriptive difference whatsoever; it’s all the exact same discussion and methodology as far as I am reading -everywhere-.
Ged says:
November 3, 2011 at 11:55 am
It seems to me that what you claim is distinct from tuning is defined by you in the same way tuning has already been defined, so what you describe isn’t different.
All your examples are general babble, but I have yet to see a single explicit example of a single parameter being changed to rectify a difference between model output and observations. There is a difference between a better empirical determination of a parameter and the parameter being changed because of the model being wrong. Please provide one.
Ged, many thanks. I’d like to highlight the first one of the citations, from the IPCC (emphasis mine):
I’d like to invite Leif to pay close attention to that. In general the “particular variables” start with the historical 20th century global mean surface air temperature. Parameter values are adjusted to optimize the simulation of historical values. This is called “tuning”, Leif.
The models all do a passable job of hindcasting the 20th century global mean surface air temperature. Leif seems to believe that, despite the various models using widely different forcings, and different model representations of the atmosphere and ocean, and different mathematical methods, they all converge on the historical record by the pure power of physics or something. I don’t know how he explains the paradox that despite using different forcings and methods, they arrive at the same answer. I say it’s from tuning. I’ve asked him that question. He did not answer.
Ged, you and I and the climate modelers know that they do it by tuning. This tuning takes place over the years of the development of the model. You make the first crude model. You pick some forcings. You see how it does on the 20th century hindcast. You change the forcings. You adjust a few parameters. You run it again. It does either better or worse. You make further changes based on that. Lather, rinse, repeat. Gradually, over the years, your model gets better and better at hindcasting the 20th century.
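That iterative loop can be caricatured in a few lines of code. This is my own toy sketch with a single made-up parameter (“sensitivity”) and a made-up linear “record”; a real GCM adjusts many parameters at once, but the lather-rinse-repeat structure is the same:

```python
import random

# A cartoon of the tune-and-rerun loop described above. One made-up
# parameter ("sensitivity"); real GCM tuning adjusts many at once.
def hindcast_error(params, model, observed):
    sim = [model(t, params) for t in range(len(observed))]
    return sum((s - o) ** 2 for s, o in zip(sim, observed))

def tune(model, observed, params, n_iter=500, step=0.1, seed=0):
    rng = random.Random(seed)
    best = hindcast_error(params, model, observed)
    for _ in range(n_iter):
        # Perturb one parameter; keep the change only if the hindcast improves.
        trial = dict(params)
        key = rng.choice(list(trial))
        trial[key] += rng.uniform(-step, step)
        err = hindcast_error(trial, model, observed)
        if err < best:  # lather, rinse, repeat
            params, best = trial, err
    return params, best

# Toy "20th century record": a linear warming trend.
observed = [0.007 * t for t in range(100)]
model = lambda t, p: p["sensitivity"] * t
tuned, err = tune(model, observed, {"sensitivity": 0.0})
print(tuned, err)
```

The loop will happily drive the hindcast error down regardless of whether the tuned parameter means anything physically, which is exactly the worry.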
Whether Leif will pay the slightest attention to your large number of well chosen citations is another matter. I gave up trying to convince him, he seemed impervious to logic, references, prior scientific work, or discussion.
w.
“Leif Svalgaard says:
November 3, 2011 at 10:52 am
Gary Swift says:
November 3, 2011 at 10:38 am
the experts aren’t afraid to say things like “the evidence is inconclusive” or “there are several popular theories”
It is interesting that there are no global climate models [that I know of] written by, run by, or otherwise pushed by skeptics.”
That is a straw man argument. Nice try, but FAIL. I am not going to bother chasing my own tail by responding to your illogical proposal that only someone who has their own climate model is allowed to have an informed opinion.
What I said about climate science is true, and it has nothing to do with models. I did not mention them.
As for my background, how do you know that I do not have adequate knowledge for an informed opinion? Climate scientists aren’t the only professionals who deal with temperature, pressure, humidity, etc. on a daily basis. Yes, believe it or not, industry does have to deal with physics once in a while. Who woulda thunk it? OMG!
Leif: “The assumption is that we [by working hard on this] can get better and better models and that they have predictive skill until proven otherwise. Every time a prediction fails we learn something new and can improve the models. This is how science [of any stripe] operates.”
The models should be treated as having “predictive skill until proven otherwise?” Let’s see . . . By this logic any Tom, Dick or Harry can put together a model and we must treat it as though it has predictive value until proven otherwise. Then when it fails after — oh, let’s see, how long do they claim we must wait, was it 17 years, or 30 years? — then Tom, Dick and Harry can make a few tweaks to the model and we must again *assume* that the “improved” model has predictive skill until another decade or two have passed, at which point they tweak it again . . .
Sorry, but you are not describing objective science [of any stripe]. You are describing an unhealthy blind belief in something that has never been demonstrated to have meaningful predictive skill. Maybe you run your science that way, but I think most of us will stick to the perfectly logical approach that unless, and until, Tom, Dick and Harry can demonstrate that their model has predictive value, there is no reason to think that it does, and absolutely no obligation to pretend that it does in obeisance to some broad notion of the advancement of science. And there is definitely no valid reason to pass laws and spend money based on a model that has not demonstrated meaningful predictive skill. That’s not science; it is insanity.
Leif : “There is a difference between a better empirical determination of a parameter and the parameter being changed because of the model being wrong. Please provide one.”
Is there? What is the difference Leif?
Willis Eschenbach says:
November 3, 2011 at 12:37 pm
In general the “particular variables” start with the historical 20th century global mean surface air temperature. Parameter values are adjusted to optimize the simulation of historical values. This is called “tuning”, Leif.
Provide a cite of a specific parameter that was tuned in that way.
TimTheToolMan says:
November 3, 2011 at 3:53 pm
Leif : “There is a difference between a better empirical determination of a parameter and the parameter being changed because of the model being wrong. Please provide one.”
Is there? What is the difference Leif?
An example: the Charnock relation. Here is how it is measured and parameterized [calibrated]:
http://powwow.risoe.dk/publ/Lange-POW'WOW-WorkshopPorto2007-WindWaveInteraction.pdf
The value used in atmospheric modeling for the parameter varies from 0.00001 [smooth sea] to 5 [tall building ] depending on the surface. Now it could be that the value also depends on the grid size of the model [or the time step or other model characteristic], so that a different set of values would work better. This might be explored by tuning the model using different values for the parameters [rather than the non-model empirical determination] to get a better fit to the observations. That is the difference between calibration and tuning.
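If I follow Leif’s distinction, it might be sketched like this; the numbers and the toy model are mine, purely illustrative, and not a real Charnock-relation fit:

```python
# Hypothetical numbers throughout; this only illustrates the distinction
# between calibration and tuning, it is not a real Charnock fit.

# "Calibration": determine the parameter from independent measurements,
# outside the model entirely. Pairs are (u_star^2 / g, roughness z0).
measurements = [(10.0, 0.11), (15.0, 0.24), (20.0, 0.44)]
alpha_calibrated = sum(z0 / x for x, z0 in measurements) / len(measurements)

# "Tuning": sweep the same parameter inside a (toy) model and keep
# whichever value best reproduces a target model output, regardless of
# what the independent measurements say.
def toy_model_output(alpha):
    return 100.0 * alpha  # stand-in for a full model run

target = 3.5
candidates = [0.005 + 0.005 * i for i in range(20)]
alpha_tuned = min(candidates, key=lambda a: abs(toy_model_output(a) - target))

# The two procedures need not agree on the "best" parameter value.
print(alpha_calibrated, alpha_tuned)
```

The calibrated value is anchored to the measurements; the tuned value is anchored to the model’s output, and nothing forces the two to coincide.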
Eric Anderson says:
November 3, 2011 at 3:29 pm
That’s not science; it is insanity.
The insanity is that people vote for politicians that exploit the ignorance of the people. Which one did you vote for?
Leif Svalgaard says:
November 3, 2011 at 4:10 pm
First, Leif, this is not my claim. It is a statement from the IPCC.
A specific example of a specific parameter? Sure. On June 11, 2010, programmer Harry was working on the GISS Model E when he noticed that … oh, wait, no, modelers don’t record when they tune each parameter. They just tune it.
I gave you an example already. Radiation balance. You said no, that wasn’t tuning … here’s the situation.
Their model doesn’t balance regarding TOA radiation. Two choices. Fix the underlying problem, or just tweak a parameter. They chose to tweak a parameter, Uoo, until it balances … and according to you, that is not tuning?
More to the point, how is that artificial balance even remotely satisfactory? You keep claiming the models are based on physics? That’s laughable. When you have to jerk a parameter around to get the model to do something as fundamental as balance, that is indisputable evidence that it is not based on physics.
So, I gave you another example. Albedo. No, that wasn’t tuning either, that was just matching the model to some observational parameters or something … but GAVIN SAID ALBEDO WAS TUNED.
But in any case, here’s another example. In the GISS Model E, they were having problems with the ice melt ponds. When ice melts from above and liquid water ponds on the surface, it changes the thermodynamic properties of the ice. But the model was getting melt ponds on the ice in months when they have historically never been seen.
So … how to fix it? Well, you could fix the underlying physics, or you could tweak parameters.
What did they do? Well … they arbitrarily limited melt pond formation to a couple months out of the year at each pole. Doesn’t matter that sometimes the ice ponds actually form in the other six months. Doesn’t matter that the physics are incorrect.
They just introduce a brand new parameter, and they adjust it so that the pond melt is a better match to the historical record.
In other words, this is another example of tuning. This is exactly the example you asked for, a specific adjustment of a specific parameter (months of melt pond formation) that was tuned to match observations.
w.
PS—Here’s the subroutine from the GISS Model E:
Note that the “pond melt season” is defined as the months of June and July … I wonder if anyone has notified the sea ice that the pond melt can only accumulate in June and July … this is a tuned routine, Leif. It has no physical basis. The months have been selected to match historical observations.
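Since the subroutine itself is not reproduced here, the following is a purely hypothetical Python sketch (all names and numbers are mine, not GISS’s) of the kind of month-gated parameterization being described:

```python
# Purely hypothetical sketch -- NOT the GISS Model E source code. It only
# illustrates the structure under discussion: melt-pond physics that runs
# inside a hard-coded calendar window chosen to match observations.
POND_MELT_MONTHS_NH = {6, 7}   # June and July only, Northern Hemisphere
POND_MELT_MONTHS_SH = {12, 1}  # the equivalent austral-summer window

def pond_melt_allowed(month, northern_hemisphere=True):
    """Gate melt-pond formation on the calendar, not the energy balance."""
    window = POND_MELT_MONTHS_NH if northern_hemisphere else POND_MELT_MONTHS_SH
    return month in window

def melt_pond_fraction(month, surface_temp_c, northern_hemisphere=True):
    # Outside the tuned window, ponds are forbidden even if the ice is melting.
    if not pond_melt_allowed(month, northern_hemisphere):
        return 0.0
    return 0.3 if surface_temp_c > 0.0 else 0.0  # crude pond cover fraction

print(melt_pond_fraction(8, 2.0))  # August, above freezing: still 0.0
```

An August melt event gets zero pond fraction no matter what the ice is actually doing, which is the complaint in a nutshell.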
Willis Eschenbach says:
November 3, 2011 at 4:55 pm
It has no physical basis. The months have been selected to match historical observations.
The assumption that melt accumulates in the summer when the insolation is highest is a sound physical basis and that that makes for a better match… Or perhaps you disagree with that.
Their model doesn’t balance regarding radiation. Two choices. Fix the underlying problem, or just tweak a parameter. They chose to tweak a parameter
Which parameter?
I think you consistently confuse calibration with tuning.
Leif writes : “Now it could be that the value also depends on the grid size of the model [or the time step or other model characteristic], so that a different set of values would work better.”
Or it could be that the underlying physics doesn’t actually represent the reality sufficiently well, and that the parameter which has the role of fudging for averaging changes in surface structure over a large area also fudges for a slightly misunderstood physical process, because who would have thought that algae would have had a large effect on ocean roughness?
Leif, in my previous reply my “Fictitious” tag was lost surrounding the comment on algae… in case you think I’m being factual with that statement.
TimTheToolMan says:
November 3, 2011 at 5:32 pm
Or it could be that the underlying physics doesn’t actually represent the reality sufficiently well and that the parameter which has the role of fudging for averaging changes
Part of building a good model is to get the physics as correct as possible. This is hard and several tries or experimentation is needed [perhaps ongoing] to get a better representation. Sometimes you discover that you are missing a process, which you then have to add. There is a learning process here. Getting the physics right is not ‘fudging’, nor is it ‘tuning’, it is called ‘research’.
Leif Svalgaard says:
November 3, 2011 at 5:05 pm (Edit)
No, they picked the months of June and July, not when the insolation is highest.
The new parameter they just added, the “only at certain times” parameter for the ice melt ponds.
You think it’s all calibration? Fine. I’ll agree with that.
The title of the linked paper is “Our Calibrated Model has No Predictive Value”. Since we both agree that the climate models are calibrated, why do you claim the paper does not apply to calibrated climate models just as it does to their own calibrated models?
(Now Leif will tell us that they are neither calibrated nor tuned, they’re “adjusted” or some other word … Leif, we’ve provided a stack of references to how climate models are calibrated.)
w.
PS—I’m still waiting for your explanation of how, if the model physics are so good, different models use very different forcings, and different physical representations of the ocean and atmosphere … but they can all hindcast the 20th century. I doubt that it’s coincidence, it can’t be the physics, and I don’t think it’s luck, so I say it’s tuning … you say … well, so far you’ve avoided saying why they can all hindcast the past so well using totally different inputs. See Kiehl for the examples and details.
Still waiting for your answer on that one, Leif …
Willis Eschenbach says:
November 3, 2011 at 4:55 pm
the pond melt can only accumulate in June and July … this is a tuned routine, Leif. It has no physical basis. The months have been selected to match historical observations.
Here is the physical basis for the routine:
http://en.wikipedia.org/wiki/File:InsolationTopOfAtmosphere.png
The code is a crude approximation to the function given in the Figure. I wonder if you would also have said that the routine had no physical basis if the code was actually calculating the function and using that as a coefficient [suitably scaled].
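For instance, a smooth seasonal coefficient could stand in for the on/off month gate. Here is a rough sketch of my own; the cosine is a stand-in for the actual top-of-atmosphere insolation function in the Figure, not the real formula:

```python
import math

# A rough sketch of a smooth seasonal weight in place of a two-month
# on/off gate. The cosine is a stand-in for the real top-of-atmosphere
# insolation function, suitably scaled to [0, 1]; day 172 approximates
# the Northern Hemisphere summer solstice.
def seasonal_coeff(day_of_year, northern_hemisphere=True):
    phase = 2.0 * math.pi * (day_of_year - 172) / 365.0
    if not northern_hemisphere:
        phase += math.pi  # shift by half a year for the Southern Hemisphere
    return max(0.0, math.cos(phase))

print(seasonal_coeff(172))  # summer solstice: 1.0
print(seasonal_coeff(50))   # late NH winter: 0.0
```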
Willis Eschenbach says:
November 3, 2011 at 5:57 pm
The new parameter they just added, the “only at certain times” parameter for the ice melt ponds.
See my comment on that.
Leif, we’ve provided a stack of references to how climate models are calibrated.
As long as they are calibrated based on physics [like the melt periods] I have no problem. The calibrations can be good or bad. It is research to make them better.
if the model physics are so good, different models use very different forcings, and different physical representations of the ocean and atmosphere
Because this is a hard problem. Where we [might] differ is that I think the problem can be solved, eventually. I was just at a conference http://sdo3.lws-sdo-workshops.org/ where we were discussing how to model the solar atmosphere and interior, and could only marvel at the enormous progress we have made in the last ten years. I fully expect progress in climate modeling too.
Leif writes “Sometimes you discover that you are missing a process, which you then have to add. There is a learning process here. Getting the physics right is not ‘fudging’, nor is it ‘tuning’, it is called ‘research’.”
But meanwhile many aspects of climate models are known to be deficient, and many others are no doubt deficient in ways not yet known. This paper tells us that none of those models represent any form of reality with their predictions.
The upshot of trying to use models to validate the “CO2 dunnit” theory of GW is FAIL based on guaranteed incorrect results and circular reasoning.