A Modest Proposal—Forget About Tomorrow

Guest Post by Willis Eschenbach

There’s a lovely 2005 paper I hadn’t seen, put out by the Los Alamos National Laboratory entitled “Our Calibrated Model has No Predictive Value” (PDF).

Figure 1. The Tinkertoy Computer. It also has no predictive value.

The paper’s abstract says it much better than I could:

Abstract: It is often assumed that once a model has been calibrated to measurements then it will have some level of predictive capability, although this may be limited. If the model does not have predictive capability then the assumption is that the model needs to be improved in some way.

Using an example from the petroleum industry, we show that cases can exist where calibrated models have no predictive capability. This occurs even when there is no modelling error present. It is also shown that the introduction of a small modelling error can make it impossible to obtain any models with useful predictive capability.

We have been unable to find ways of identifying which calibrated models will have some predictive capacity and those which will not.

There are three results in there, one expected and two unexpected.

The expected result is that models that are “tuned” or “calibrated” to an existing dataset may very well have no predictive capability. On the face of it this is obvious: if tuning a model were that simple, someone would already be predicting the stock market or next month’s weather with good accuracy.

The next result was totally unexpected. The model may have no predictive capability despite being a perfect model. The model may represent the physics of the situation perfectly and exactly in each and every relevant detail. But if that perfect model is tuned to a dataset, even a perfect dataset, it may have no predictive capability at all.

The third unexpected result was the effect of error. The authors found that if there are even small modeling errors, it may not be possible to find any model with useful predictive capability.

To paraphrase, even if a tuned (“calibrated”) model is perfect about the physics, it may not have predictive capabilities. And if there is even a little error in the model, good luck finding anything useful.

This was a very clean experiment. There were only three tunable parameters. John von Neumann reportedly said that with four parameters you can fit an elephant, and with five you can make him wiggle his trunk; here, even three were enough to produce calibrated models with no predictive value.
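To make the tuning problem concrete, here is a toy sketch (my own illustration, not the paper’s petroleum example): a three-parameter curve is “calibrated” to one window of synthetic data, matches that window closely, and then extrapolates badly.

```python
# Illustrative only: synthetic data and a generic 3-parameter (quadratic) fit,
# standing in for "calibration" of a model to a historical window.
import numpy as np

rng = np.random.default_rng(0)

t = np.arange(0.0, 40.0)
# Hypothetical "truth": a slow oscillation plus observation noise.
truth = 2.0 * np.sin(2 * np.pi * t / 60.0) + rng.normal(0.0, 0.2, t.size)

calib = t < 20                                       # calibration window
coeffs = np.polyfit(t[calib], truth[calib], deg=2)   # three tunable parameters
fitted = np.polyval(coeffs, t)

rmse_calib = np.sqrt(np.mean((fitted[calib] - truth[calib]) ** 2))
rmse_pred = np.sqrt(np.mean((fitted[~calib] - truth[~calib]) ** 2))
print(f"RMSE over calibration window: {rmse_calib:.2f}")
print(f"RMSE over prediction window:  {rmse_pred:.2f}")  # typically far larger
```

The close fit over the calibration window says nothing about skill outside it, which is, in miniature, the kind of failure the paper describes.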

I leave it to the reader to consider what this means about the various climate models’ ability to simulate the future evolution of the climate, as they are definitely tuned (or, as the study authors call them, “calibrated”) models, and they definitely have more than three tunable parameters.

In this regard, a modest proposal. Could climate scientists please just stop predicting stuff for, say, one year? In no other field of scientific endeavor is every finding surrounded by predictions that this “could” or “might” or “possibly” or “perhaps” will lead to something catastrophic in ten or thirty or a hundred years. Could I ask that for one short year, climate scientists actually study the various climate phenomena, rather than try to forecast their future changes? We are still a long way from understanding the climate, so could we just study the present and past climate, and leave the future alone for one year?

We have no practical reason to believe that the current crop of climate models have predictive capability. For example, none of them predicted the current 15-year or so hiatus in the warming. And as this paper shows, there is certainly no theoretical reason to think they have predictive capability.

Models, including climate models, can sometimes illustrate processes or provide useful information about the climate. Could we use them for that for a while? Could we use them to try to understand the climate, rather than to predict the climate?

And 100- and 500-year forecasts? I don’t care if you do call them “scenarios” or whatever the current politically correct term is. Predicting anything 500 years out is a joke. Those, you could stop forever with no loss at all.

I would think that after the unbroken string of totally incorrect prognostications from Paul Ehrlich and John Holdren and James Hansen and other failed serial doomcasters, the alarmists would welcome such a hiatus from having to dream up the newer, better future catastrophe. I mean, it must get tiring for them, seeing their predictions of Thermageddon™ blown out of the water by ugly reality, time after time, without interruption. I think they’d welcome a year where they could forget about tomorrow.

Regards to all,

w.

299 Comments

Ged
November 4, 2011 12:23 pm

@Leif,
You make great points, but I still do not see how any of this is different from the definition of tuning as put forth in the head article, and as used in each and every one of those sources, such as “tuning is carried out through alterations of parameter values in these physics descriptions” and “…and hence there are numerous ways to tune the model to a chosen level of radiative balance.” That’s the definition of tuning I’m reading here and everywhere, yet somehow you are claiming that isn’t tuning, or isn’t what’s being discussed? What is tuning then, by your definition? This is the disconnect I don’t understand in your reasoning, nor does Willis, from what I’m inferring from his responses.
About using random values to find a good fit, again we see from just that one paper (as is also used in all the others): “We modify a number of parameters that are commonly used for tuning (Hack et al 2006), including relative humidity thresholds for cloud formation, thresholds for autoconversion of liquid and ice to rain and snow, efficiency of autoconversion in convective and stratiform clouds, efficiency of precipitation evaporation and adjustment timescales associated with convection… there is presently no way to determine which is more correct.”
So, they played with the values, “tuning” them to see if the model would best fit the observational data between either data set, and which tuned parameters would be most effective, yet it came out as a wash, with no way to determine which was more correct. This is in agreement with the head post’s paper about such tuning/calibration not being effective for models. And yet, we have from this paper another reference to a previous paper about tuning. GCMs are tuned, as all my references stated, including state-of-the-art ones, and so fall under the topic of discussion and are not exempt as you wish to make them. The word tuning is used all the same, in the same scientific context, etc.
This is why I cannot fathom your reasoning: it flies in the face of what’s clearly written and described.
The whole idea is that calibrating a model with physics or observation does not give it predictive power; that’s what the original paper seems to be saying. And here I’ve shown you many times calibration/tuning are used on models with particular parameters listed. And in the end, the conclusion of that particular paper we’re discussing, when looking at various tuning parameters, was that “there is presently no way to determine which is more correct.”
So, unless completely new methods and data I’m not familiar with have come up, all my reading comes to the same conclusion:
1. GCMs are tuned, as they themselves report, as the IPCC reports, as the methodology reports. Using that exact word, “tuned” or “tuning” or “calibration”. All uses are seen describing the same method: changing parameters iteratively to best match the behavior of the nascent model to observations. The absolute range for the parameters being tuned comes from physics, but their values are subject to revision against observation.
2. This tuning is the tuning under discussion by this post and the paper in the head post.
3. Tuning is not seen to increase the predictive value of models, in agreement with the head post’s paper, which similarly stated that calibrated, tuned models had no predictive power for their investigation.
This does not mean there is no predictive power to weather models, as there is at least a “best guess” effect. But nonetheless, the conclusions of the head post paper ARE applicable to the GCMs, and you have shown no evidence to the contrary, and I and others have provided ample references using and describing ‘tuning’, which stand contrary to your arguments.
Again, how is tuning of GCMs listed in the references different from tuning as listed by the head post paper? I see absolutely no scientific difference in the definitions or applications or conclusions. Again, maybe I’m missing it, or maybe I’m outdated, but so far you still have not shown the difference between tuning listed in my references and others, and tuning listed in the head post, nor the difference in the conclusions or the exclusivity of GCMs from the conclusion.

November 4, 2011 12:59 pm

Ged says:
November 4, 2011 at 12:23 pm
You make great points, but I still do not see how any of this is different from the definition of tuning as put forth in the head article, and as used in each and every one of those sources.
The crucial difference is one of physical understanding. Let us take the case of melt ponds. Suppose that altering the rule to ‘if month = October or month = January or month = July then …’ actually did improve the fit for historical data; then that would be ‘tuning’ to match the model to the observations. But that would IMO not presage improved predictability for the future, because the tuning did not have a reasonable physical basis. If, on the other hand, the actual insolation was computed from astronomical data and that improved the fit, then I would say that a better physical calibration improved the model. I hope that this example clarifies the difference. Of course, for rabid contrarians, nothing I say will clarify anything. I trust that you do not belong to that category.
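As an illustration of that distinction (my own sketch, not code from any actual model: the month list, the threshold, and the function names are invented), the first rule below is pure tuning, while the second estimates top-of-atmosphere insolation from the date with a standard textbook approximation and applies a physically motivated threshold.

```python
import math

# Tuned rule: melt is allowed only in months picked by trial and error because
# they happened to improve the historical fit (hypothetical, no physical basis).
def melt_allowed_tuned(month: int) -> bool:
    return month in (5, 6, 7, 8)

# Calibrated rule: estimate daily-mean top-of-atmosphere insolation from the
# date and latitude (standard approximation, ignoring orbital eccentricity),
# then compare it to a physically motivated threshold.
def toa_insolation(day_of_year: int, lat_deg: float, s0: float = 1361.0) -> float:
    decl = math.radians(-23.44) * math.cos(2 * math.pi * (day_of_year + 10) / 365.0)
    lat = math.radians(lat_deg)
    cos_h0 = max(-1.0, min(1.0, -math.tan(lat) * math.tan(decl)))  # polar day/night clamp
    h0 = math.acos(cos_h0)  # hour angle of sunset
    return (s0 / math.pi) * (h0 * math.sin(lat) * math.sin(decl)
                             + math.cos(lat) * math.cos(decl) * math.sin(h0))

def melt_allowed_calibrated(day_of_year: int, lat_deg: float,
                            threshold_wm2: float = 300.0) -> bool:
    return toa_insolation(day_of_year, lat_deg) > threshold_wm2

print(round(toa_insolation(172, 75.0)))      # high-Arctic midsummer: ~500 W/m^2
print(melt_allowed_calibrated(355, 75.0))    # polar night: False
```

Both knobs can be adjusted to fit the past; only the second carries a physical reason to expect the fit to hold in the future, which is the point Leif is making.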

Ged
November 4, 2011 1:15 pm

@Leif,
Please also notice I’m not making any claims about the veracity of the results from the head paper, nor trying to interpret their meaning.
I should also make sure we’re on the same page about what “predictive power” means, which from my understanding is that “as a value changes in the future in a way we expect it will, we predict other values will change in an x or y way in response”. So while “tuning” changes values to match observations of the past, “prediction” changes expected values and reads out what the “observations” should look like in the future.
There is no doubt tuning is done for GCMs as far as I see and read, again I make this point so we can get off that hang up and onto the real discussion about tuning verses predictive power.
Can we make models which do have some level of predictive power, without using tuning/calibration (which come from physics and observation in the first place)? Maybe; and a complete understanding of physics should mean no tuning would ever be required: the model would work for dependent values the same forwards and backwards, as long as independent values changed as they were expected to. We don’t have a complete understanding of physics, so how do we gauge “predictive power” and assign a quantitative value to a model’s ability to predict the future? I can think of ways, but it’s all a crap shoot.
If tuning/calibration does not increase the predictive power of a model, what does that imply, and how do we move forward to make useful, predictive models? What degree of “predictive power” do we shoot for? What degree do we say is enough? What level do we need before we place enough confidence in a predictive series to try, if we wanted, to modify the expected change in the independent value, in order to avoid the changes in the corresponding dependent values that the model predicts will be affected by x or y if we do nothing?
I don’t have the answers to those questions, and have yet to see convincing answers to those questions. The paper in the head post challenges the usefulness of models, or at least the tuning/calibration-based-on-physics methods for making “predictive” models. But quantifying this issue is beyond me at the moment. And this affects all fields, not just GCMs.
Look at the semiconductor industry: models there are never right; though they are used as a starting place, they never predict a chip’s performance accurately. And surely we know the physics there better than the climate, and surely businesses have more invested in the matter and a need for more accuracy than anyone with GCMs. Many months are spent revising chips before they are brought to market, and some still fail spectacularly despite model predictions (e.g., AMD’s Bulldozer chip). In all fields, I have never seen a model with much predictive power, only as something that could guide experimentation as a hypothesis that “might be”, and thus is testable.
So, I can’t tell you the ultimate conclusions to all this, other than what we can plainly observe everywhere, and what this head paper points out clearly. You can make arguments about its results and the interpretations to take from them, and how those might apply to GCMs; and I would be very interested in that.

Ged
November 4, 2011 1:23 pm

@Leif,
“If, on the other hand, the actual insolation was computed from astronomical data and that improved the fit, then I would say that a better physical calibration improved the model.”
I think here lies the heart of what you’re trying to get at, and differentiate? From what I’m reading, of the head post and all the sources, what you say there -is tuning-. That is what tuning is, right there. You took observed astronomical data, calculated new features to improve the fit to other observed historical melt pond data. Now, if astronomical data was never included in the first place, then it wouldn’t be tuning, I would think, as you added a new feature/constraint. But if astronomical data was already in there, and you just updated the calculated value to increase fit with the melt data, that would clearly be tuning by every definition I’m seeing. That is my understanding.
If the model correctly had the physics and equations for melt ponds, it would need -no- astronomical data or calculations from it; -no- outside observations of any sort would be needed: the pure math itself would deterministically decide the melt pond values, and that would work going backwards (independent values fed in from historical data where melt pond data is dependent), and then would work going forwards. Having to modify “insolation” from astronomical data is tuning, and the melt pond data is no longer a self-contained constraint.
And indeed, as put forth in my references and elsewhere, that is what GCMs do, clearly, and thus it is fair to say they are tuned in the same regard as “tuning” is discussed and used in the head post paper.
That is my logic and understanding at the moment, and I -could always be wrong/off- from what you’re getting at. I do not yet see the distinction you are attempting to draw, it’s all the same as far as I am seeing.

November 4, 2011 6:21 pm

Ged says:
November 4, 2011 at 1:23 pm
That is what tuning is, right there. You took observed astronomical data, calculated new features to improve the fit to other observed historical melt pond data.
No, that is not tuning. That is calibration because I use a physical process and reason for the change. If I just varied the parameter at random until I found the best fit, that would be tuning [of – in fact – the parameter]. For the calibration I expect improved predictive capability, for the tuning I cannot have such expectation [one can always hope, but that is not the same]. Put differently: curve fitting [tuning] does not improve predictions. Calibration with better physics does.

November 4, 2011 9:20 pm

Ged says:
November 4, 2011 at 1:23 pm
That is what tuning is, right there.
To put it as succinctly as possible:
Mere tuning does not improve the predictive power. Improving the parameters [or adding more] based on better measurements and physics-based calibrations will improve the predictive power. Same with improvements in both spatial and temporal resolution.

Richard S Courtney
November 5, 2011 2:30 am

At November 4, 2011 at 7:33 am you say:
“Richard S Courtney says:
November 4, 2011 at 1:10 am
“But I write to inform you that they provide discredit of you by impartial observers.”
“At least I am civil and provide comments that explain my position rather than just parroting others.”
Say what!?
Even in this case you imply that I do not “explain my position” and “parrot others”, which are both falsehoods. That is not “civil”, and your attempts to “explain” your “position” have been a total failure in this thread.
Richard

November 5, 2011 4:34 am

Leif writes “Improving the parameters [or adding more] based on better measurements and physics-based calibrations will improve the predictive power.”
In a classic three-body problem, if you know precisely the velocities and masses of the bodies, then you can predict their positions in the future. This is a perfect model with perfect data. However, if you don’t know two of the bodies’ velocities precisely, then you can no longer predict their positions in the future at all (except perhaps in the very, very near future, where you have an approximation).
Now imagine you learn precisely one of those unknown bodies’ velocities, thus leaving only one unknown body. Are you any better off? No. You still can’t predict the future for the bodies.
So it is with GCMs. Improvements are meaningless as far as their ability to predict the future goes. They may be able to predict it better in the very, very short term, but essentially they’re as useless for prediction as they ever were.
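A rough sketch of that sensitivity (my own toy setup with invented masses and initial conditions, not from any published model): two copies of the same three-body system are integrated, one with a single velocity component mis-specified by 0.1%, and the final positions are compared.

```python
import numpy as np

def accelerations(pos, masses, G=1.0, eps=1e-3):
    """Pairwise gravitational accelerations, softened to avoid singularities."""
    acc = np.zeros_like(pos)
    for i in range(len(masses)):
        for j in range(len(masses)):
            if i != j:
                d = pos[j] - pos[i]
                r = np.sqrt(d @ d + eps ** 2)
                acc[i] += G * masses[j] * d / r ** 3
    return acc

def integrate(pos, vel, masses, dt=1e-3, steps=20000):
    """Velocity-Verlet integration; returns the final positions."""
    pos, vel = pos.copy(), vel.copy()
    acc = accelerations(pos, masses)
    for _ in range(steps):
        pos += vel * dt + 0.5 * acc * dt ** 2
        new_acc = accelerations(pos, masses)
        vel += 0.5 * (acc + new_acc) * dt
        acc = new_acc
    return pos

masses = np.array([1.0, 1.0, 1.0])
pos0 = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, 0.5]])
vel0 = np.array([[0.0, 0.3], [0.0, -0.3], [0.3, 0.0]])

final_exact = integrate(pos0, vel0, masses)

vel_approx = vel0.copy()
vel_approx[2, 0] *= 1.001          # one velocity "known" only to 0.1%
final_approx = integrate(pos0, vel_approx, masses)

print("Largest position difference:", np.abs(final_exact - final_approx).max())
```

For chaotic configurations the gap typically grows far beyond the size of the initial error, which is the commenter’s point about imprecisely known velocities.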

November 5, 2011 5:23 am

Richard S Courtney says:
November 5, 2011 at 2:30 am
Even in this case you imply that I do not “explain my position” and “parrot others”, which are both falsehoods. That is not “civil”, and your attempts to “explain” your “position” have been a total failure in this thread.
I’ll grant that there are people who simply don’t get it.

November 5, 2011 6:01 am

Leif writes “I’ll grant that there are people who simply don’t get it.”
You believe that despite the imperfect processes within the models, because they are constrained by physics they are still modelling what happens when a perturbation occurs.
I believe that those imperfectly modelled processes invalidate the effect of that perturbation, and so the models cannot predict what its effect is. The failure of the models to model those processes happens on many levels. The model in this article demonstrates that very well.
Some of us understand this a lot better than you think, Leif. We don’t have dogs in this race, though, so we can perhaps be a little more objective about it.
Incidentally, you appear to be rejecting this notion on irrational grounds. Do you believe the model in this article isn’t based on well understood physics? I mean, it appears to be based on a commercial oil exploration and production model, after all. I expect it’s actually considerably better than most if not all of the GCMs.

November 5, 2011 6:26 am

Actually I quite like my three body analogy and can push it a little further.
You believe that the forcing of CO2 is akin to giving the 3-body system a bit of a push in one direction as a whole. You (may) accept you can’t model where any of the 3 bodies are precisely if you don’t know the facts about them precisely, but you can model where they have moved laterally in a general sense.
So something like… body “a” is constrained to be within such and such a diameter and x km north of the starting point, because we knew how hard we pushed the system as a whole.
This is wrong, however, because CO2 is not a true forcing. If the sun had increased its output by a few Watts then yes, the above would be a valid analogy, but the change is “internal”, and so it’s not that the system gets to move laterally; rather, its bodies are pushed onto different courses.
This, I believe, is a fundamental misunderstanding by people who believe we may not be able to measure weather but climate change from CO2 is perfectly achievable.
There is scope for atmospheric/oceanic processes to change such that they transport energy more quickly upwards to decrease the temperature gradient. There are lots of possibilities. But there is no guaranteed effect. Not even due to the relatively well known radiative properties of the GHGs because there are so many other important processes that will respond to the changes and will naturally try to minimise the temperature gradients.

November 5, 2011 6:29 am

Oh and above I said “measure the weather” but what I meant was “model/predict the weather”

November 5, 2011 6:32 am

TimTheToolMan says:
November 5, 2011 at 6:01 am
Do you believe the model in this article isn’t based on well understood physics? I mean, it appears to be based on a commercial oil exploration and production model, after all. I expect it’s actually considerably better than most if not all of the GCMs.
‘appears’, ‘expect’…
The model in the paper is just curve fitting and as such does not have the predictive power that solving the governing differential equations does. This should not be difficult to understand, and most people here seem to have grasped it [with a few notable exceptions]. If you expect that curve fitting is considerably better than solving the equations then there is little hope.

November 5, 2011 6:38 am

TimTheToolMan says:
November 5, 2011 at 6:26 am
This is wrong however because CO2 is not a true forcing.
The models do not assume that. They simply calculate the effects of the system as a whole. You give yourself away when you offer “due to the relatively well known radiative properties of the GHGs”. Those properties are not ‘relatively well known’. They are ‘extremely well known’ as they are measured precisely in the laboratory.

November 5, 2011 6:52 am

Leif writes “The model in the paper is just curve fitting and as such does not have the predictive power…”
Well, maybe. They don’t mention what model they’ve used specifically, or whether it is, as you suggest, a very simple application of the three parameters mentioned. However, it’s clear it’s more complicated than a simple equation, because they describe the geological feature modelled and it’s non-trivial.
“The models do not assume that.”
They don’t need to, and I never said they did. That’s missing or ignoring the point entirely.
I guess we’re going to disagree here too. The radiative properties in our atmosphere are only relatively well known. CO2 isn’t evenly distributed throughout the atmosphere and there are attempts to measure its distribution. It varies by season too. Water vapour likewise varies considerably. There is a world of difference between what is measured in the lab and what happens in the real world.

November 5, 2011 7:42 am

TimTheToolMan says:
November 5, 2011 at 6:52 am
They don’t mention what model they’ve used specifically
And you don’t consider that an important issue? There are many models in use in the petroleum industry [e.g. TDAS] and they work very well. In general, there are two problems: 1) to simulate the flow [and that is done numerically by standard simulation tools and is not the issue] and 2) [the harder of the two] to specify the shape and properties of the reservoirs. The paper in question is concerned with this second problem and the errors introduced by errors in the properties of the reservoir. This is very different from the climate models.
The radiative properties in our atmosphere are only relatively well known. CO2 isn’t evenly distributed throughout the atmosphere and there are attempts to measure its distribution. It varies by season too. Water vapour likewise varies considerably.
The climate models are directly concerned with the variability of the distribution and the seasons [and those are not ‘radiative’ properties]. You might find it illuminating to actually read about what the atmospheric models do: http://www.leif.org/EOS/CAM3-Climate-Model.pdf especially sections 4.8 and 4.9
There is a world of difference between what is measured in the lab and what happens in the real world.
And you assume that climate modelers are absolute morons that do not know this. The models are concerned with applying the lab data to the real world in the best way possible.

Rational Debate
November 5, 2011 1:56 pm

Leif, back to the melt accumulation for a moment. The modelers introduced the two month limit, in order to force the model to more closely reflect what is actually seen than it had been before they introduced the 2 month limit. If they were adjusting based on physics, why didn’t they adjust the algorithms, differential equations etc., associated with calculating insolation instead? e.g., the actual model representations of the physics involved?

November 5, 2011 3:00 pm

Rational Debate says:
November 5, 2011 at 1:56 pm
why didn’t they adjust the algorithms, differential equations etc., associated with calculating insolation instead? e.g., the actual model representations of the physics involved?
Because you parameterize to the extent you think it is important. The melt ponds are not a vital element in the model so an approximation is [or was judged to be] good enough. I do not see any evidence that that better approximation [than the previous 6-month function] was introduced after a specific tuning run was made just to test the fit for the melt ponds. If you can find such evidence I would be glad to look at it. The dynamic parts of the model [the differential equations] would not be affected anyway by a better approximation of the melt pond function.

Rational Debate
November 5, 2011 4:19 pm

re: Leif Svalgaard says: November 4, 2011 at 7:33 am
Working thru the various pieces:

Rational Debate says: November 4, 2011 at 2:42 am
The assumption must be that the model does not have predictive skill until proven otherwise.

People that build the models make a great effort to do the best job possible.

Yes, I believe that to be true generally and in the majority of cases. I also believe that they are humans and so subject to all the foibles of human nature, including those that are made all the time by the most well meaning, hard working, good people. Things such as accidentally overlooking or incorporating errors, confirmation bias, and all of the unintentional problems that can arise even with the most diligent and well intentioned person.
I have to think that the very complexity of climate models works against the modelers in this regard – the more complicated and longer the code, the more incredibly difficult it becomes to root out errors, make any major modifications (rather than just trying to tweak what you’ve got), and so on.
Then there are those who are just not as diligent, or not as dedicated, or just downright lazy. Worse, human nature, unfortunately, also includes what I believe to be a far smaller number who are either openly or under cover working to suit an agenda (other than that of conducting good science), or for whatever reasons actively attempt to delude others or even themselves in the pursuit of money, power, prestige, promotions, tenure, etc., etc. Heck, even just to cover up some previous error they made because their ego can’t tolerate it coming to light.

Skill can be measured as a ‘skill score’. One definition involves the mean squared error MSE = sum ([prediction(i)-observation(i)]^2)/N. Then the skill score is SS = 1 – MSE(prediction)/MSE(climatology). A perfect prediction has a SS of 1.0. A prediction that is no better than just averaged climatology has a SS of 0, while a prediction that is worse than climatology has negative SS. The absolute value of the SS will usually decrease as the time interval covered by the prediction increases…
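A minimal sketch of that skill-score calculation, assuming “climatology” here means predicting the reference-period mean at every step; the numbers below are invented for illustration and are not from any model.

```python
import numpy as np

def skill_score(prediction, observation, climatology_mean):
    """SS = 1 - MSE(prediction) / MSE(climatology), as described above."""
    mse_pred = np.mean((prediction - observation) ** 2)
    mse_clim = np.mean((climatology_mean - observation) ** 2)
    return 1.0 - mse_pred / mse_clim

obs = np.array([0.1, 0.3, 0.2, 0.5, 0.4])    # hypothetical observed anomalies
pred = np.array([0.0, 0.2, 0.3, 0.4, 0.5])   # hypothetical model prediction
clim = 0.0                                    # reference-period mean anomaly

print(skill_score(pred, obs, clim))  # 1.0 = perfect, 0 = no better than climatology
```

With these made-up numbers the score comes out around 0.9; a negative value would mean the prediction is worse than just using the climatological mean.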

Leif, thank you very much for the description and the links. It will probably take me a little while to get to and work thru them, but I am quite interested. In the meantime, if you don’t mind, I’m wondering what “just averaged climatology” means? That if you’re projecting 5 years, the model output is within the bounds, high and low, actually observed during that time period, where a perfect prediction would be to exactly match observations throughout the entire time period? Or?
You had also said that the current models have a positive predictive ability – I’m assuming this is all done by using some chosen historical starting point for initial conditions, and then seeing how well the model predicts climate from there, such that the predictions can be compared to actual data, and that this is what is used to determine the model’s skill score, right? Also, the current model positive predictive ability – that’s over what time frame?
I’m thrown on this whole subject over the last IPCC model projections of temperature increases anywhere between, what, something like 2 to 6 degrees? When we’ve only seen approx a degree in the 20th century… e.g., how can a positive skill score be found when the model range output is 1) so very wide and 2) two to six times higher than the dataset available to calculate the skill score?

… It is also to be expected that when people solve the equations governing the evolution of the climate there will be some skill. This expectation is normally fulfilled in all other scientific endeavors. That is what science is: the ability to predict something from ‘laws’ that have been deduced from observations.

The majority of experiments aren’t positive – the null hypothesis remains. Generally the null hypothesis is there for very good reason – loads of past research and evidence suggesting that it’s almost certainly correct. The axiom of science is, if at first you don’t succeed, try, try, again. In other words, the expectation isn’t success, it’s failure. Followed by revision, and repeated attempts – or if appropriate (and it often is) based on the experimental results, abandoning the hypothesis as entirely incorrect and unable to be reformulated in any way worth pursuing, and moving on to something else.
The researcher who expects positive results is begging to run afoul of confirmation bias. Hoping your results are positive is unavoidable – we’re humans and there’s nothing wrong with that. Expecting that your experimental design is sound enough to be conclusive one way or the other with respect to the validity of your hypothesis, now that’s reasonable and typical – so long as when obtaining positive results you also expect to completely open your entire process up for other people to try their darnedest to poke holes in your experimental design, methods, results, and conclusions. Assuming or expecting positive results is not scientific and is dangerous in terms of radically increasing the chances of producing bad results, particularly over time.

As an aside – sure, it’s insane to vote for politicians that exploit the ignorance of the people – and when the only candidates fit into this category, then which do you vote for?

A people has the government they deserve.

Do they? Easy to say, but life is a bit more complicated than that. Plus, it’s a reply which doesn’t answer my question – when the only candidate options you have are both equally badly flawed, which do you vote for? Not to mention the issue of whether each of us individually deserves the government we wind up with.
As to the comment you did make, let’s imagine that space aliens descend on our planet with vastly superior technology and/or firepower. It becomes clear that resistance means 3/4’s or more of us will be killed in short order. Result, they take us over, and implement a government that we all abhor. Do we then deserve that government?
Closer to home – a warlord with money, automatic weapons, artillery, planes, bombs, maybe even chemical or biological weapons, etc., moves into an absolutely destitute area where the population is narrowly avoiding starvation and has no weapons beyond sticks, self made bows & arrows & spears. Do they deserve the government they wind up with?

it makes no sense to assume a model has predictive ability before it has been soundly proven to actually make accurate predictions.

See comment to Richard. Some skill is better than no skill. If I can predict the stock market with 51% accuracy, I’ll come out ahead in the long run.

Only if there is an investment vehicle that actually accurately matches the overall market that’s included in your model. And in following those predictions, you don’t wind up with more than the 1% being eaten up in the commissions involved in purchasing the stocks, and the tax losses each time you sell in order to readjust and follow your predictions. And you can afford to sink that much capital into the market and let it remain there for the duration necessary to recoup that 1%. And there aren’t other investments available that average a greater return in a shorter time frame. And that 51% accuracy is over a short enough time frame to accumulate “enough” for you to consider the time, effort, etc., all worth it such that you’re “ahead” before you die.
As to some skill being better than none when it comes to global climate models – that all depends on what is done with the results. Certainly it’s better from the perspective of any scientist or modeler wanting to improve on previous efforts. Also from the standpoint of being that much closer to a model that might be accurate enough to actually use more widely than just in the lab. Some skill, however, is not at all necessarily better than none if it is presented as being more meaningful than it really is, or if it is used by governments to re-order society or implement “remedies” where the cure is worse than the possible disease. Acting on 51% accuracy to spend trillions of dollars and lower the standard of living for millions or even billions, because it’ll likely warm 2 degrees in 100 years, would do unspeakably grievous damage if during those 100 years there were a period of a few decades with temps far lower than present – even if by year 100 it was 2 degrees warmer than the present.
Not to mention that what may appear to be a skill score of 0.5 today, say, could easily turn out to actually have been a negative score if naturally occurring cycles longer than those included in the model exist (they almost certainly do; we don’t have a long enough and robust enough historical instrumental data set to know), or if the skill score calculated happened to coincide with the observed data used in the calculation partly by chance rather than because of model accuracy, etc.

Rational Debate says: November 4, 2011 at 3:03 am …but the only way it’s a sound physical basis is if it matches reality. […] But it’s still a manually imposed tweak that isn’t an accurate programming representation of the actual physics involved in melt accumulation.

It is an approximation [compared to using the actual insolation] that is based on sound physics. The model builders seem to have concluded that it is good enough for their purpose. Perhaps in a later version they will use a better approximation, especially if it turns out that this parameter is important [which I doubt].

Yes, but it isn’t only about melt, is it? I mean, insolation affects more than just melt – and if its calculation is wrong based on the melt results it produces, then that physics representation is likely propagating errors through the rest of the model in other ways also. Or perhaps insolation isn’t the problem, but other unaccounted-for factors such as soot are – again, which would mean that there are errors propagating thru the model. The entire point wasn’t about how significant melt is or isn’t, but rather about how well and why the models – and the adjustment methods used on those models – wind up representing reality, and whether there is any reason to have confidence in their predictive abilities.

Rational Debate says: November 4, 2011 at 3:10 am
but unless you have a time machine to actually go and see the final results, there is simply no way to know what the solar luminosity will be

Prediction is not about ‘knowing’, but about having a skill score that is high enough to take seriously, e.g. 0.99999999999999999 or even 0.9. We have great confidence in our prediction of the Sun’s luminosity because we observe millions of stars at all ages and at all phases of their evolution and can directly verify that they behave as predicted.

At best you can directly verify that over the past x years (25? 50? 100? – I’m sure both the number of stars and the quality of observations have exploded recently), the stars that have been observed closely enough have behaved as you currently expect them to have behaved. That’s nothing relative to billions of years. There’s always that black swan.

A very large asteroid impact could certainly completely change the future of the Earth

Again, prediction is about being good enough, not perfect. Your argument is of the kind that it does not make sense to lay up supplies for the coming winter, because we may all be wiped out by an asteroid anyway. Not exactly a ‘Rational Debate’.

No, that wouldn’t be a ‘rational debate.’ That’s also not a rational argument on your part. We’ve essentially got reams of data every year going back centuries, and generational knowledge throughout the history of the existence of man that every year winter comes and to survive it’s necessary to have access to supplies. Significant asteroid strikes that affect many people, on the other hand, are extremely rare (thank gawd Tunguska didn’t happen over Moscow or NYC instead!). On the other hand, our actual instrumental data & experience with the behavior of Earth’s sun boiling off our atmosphere? Our instrumental data & experience with any significant variation of solar output over billions of years? Ya, there is no comparison – it’s not a rational analogy to have put forward as if it in any way represented what I actually said.

Rational Debate
November 5, 2011 6:12 pm

re: Ged says: November 4, 2011 at 1:15 pm

….Look at the semiconductor industry: models there are never right, though they are used as a starting place, they never predict a chip’s performance accurately; and surely we know the physics there better than the climate, and surely businesses have more invested in the matter and a need for more accuracy than anyone with GCMs. Many months are spent revising chips before they are brought to market, and some still fail spectacularly despite model predictions (i.e. AMD Bulldozer chip). In all fields, I have never seen a model with much predictive power, only as something that could guide experimentation as a hypothesis that “might be”, and thus is testable.

As best I know, that’s pretty much true in all fields. Aerospace is another prime example. The physics is far less complex than climate, and we’ve been at it far longer – yet they start with models then test, revise, test, revise, etc., and even so initial flight tests quite often surprise with actual performance significantly different than the predicted and wind tunnel tested, etc. versions, let alone what was predicted from the computer model only.
Higgs Boson is having fun with physics models. :0)

Rational Debate
November 5, 2011 6:13 pm

Apologies for the length of my November 5, 2011 at 4:19 pm post – I should have broken it into pieces I guess.

Rational Debate
November 5, 2011 6:21 pm

re: Leif Svalgaard says: November 4, 2011 at 7:33 am

….You might want to compare your ‘fixed’ version “perfectly valid incomplete physics’ to Willis’s “have no physical basis”, and note that we have made progress via our discussion. I take that as a positive sign.

It’s all relative. A model can easily contain a number of perfectly valid physics components, or even incomplete ones :0) , and still have no physical basis overall. Or what one person is willing to shade grey as perfectly valid incomplete physics, another might be more inclined to black and white and find the degree of incompleteness sufficient to categorize the thing as “no physical basis.”
Ah, the joys of the language and communication (espec. with the added difficulty of time constraints & written word only, sans vocal tone, expressions, and body language). 😉

November 5, 2011 7:18 pm

Rational Debate says:
November 5, 2011 at 4:19 pm
so subject to all the foibles of human nature
Science is very competitive and errors are ferreted out with glee.
the more complicated and longer the code, the more incredibly difficult it becomes to root out errors, make any major modifications (rather than just trying to tweak what you’ve got), and so on.
We have learned how to do this. It is called being modular and separating the various contributions. If you look at this model, you’ll see how well it is done: http://www.leif.org/EOS/CAM3-Climate-Model.pdf
under cover working to suit an agenda
I know many of the people involved and they just don’t do that. Also, the code is public and comes with a User’s Manual so you can try it out yourself. It would be too damaging to try to cheat with this.
“just averaged climatology” means?
Just the average values over a reference period [say 30 years]. The skill score will always be defined no matter what the climatology is and you just sum up the squares of the deviations from average. The skill score will always be less than 1.
what is used to determine the model’s skill score, right? Also, the current model positive predictive ability – that’s over what time frame?
There are many models and the errors in some tend to be cancelled by opposite errors in others. There is a literature on that. You can google it as well as I can.
I’m thrown on this whole subject over the last IPCC model projections of temperature increases anywhere between, what, something like 2 to 6 degrees?
I think the deviations from the record are much smaller. Here is a paper on that: http://www.leif.org/EOS/2007_Hansen_climate.pdf [Hansen is not the only author 🙂 ]
The axiom of science is, if at first you don’t succeed, try, try, again. In other words, the expectation isn’t success, it’s failure.
This may be so for discovering something new, but is certainly not the case when applying the laws we already know. For those we expect success, every time. It is really more like engineering: from the known properties of materials and mechanical laws we calculate how thick a beam must be to carry a given load, and we certainly expect that calculation to be correct.
you also expect to completely open your entire process up for other people to try their darnedest to poke holes in your experimental design, methods, results, and conclusions.
Every scientist worth her salt does that.
when the only candidate options you have are both equally badly flawed, which do you vote for?
You take to the streets. Occupy GISS! :-[)
Do we then deserve that government?
Might is right, especially ‘people power’
moves into an absolutely destitute area where the population is narrowly avoiding starvation and has no weapons beyond sticks, self made bows & arrows & spears. Do they deserve the government they wind up with?
That is the way of the world.
And that 51% accuracy is over a short enough time frame to accumulate “enough” for you to consider the time, effort, etc., all worth it such that you’re “ahead” before you die.
If you don’t like 51% make it 60%.
because it’ll likely warm 2 degrees in 100 years would do unspeakably grievous damage
I don’t think there will be any damage, as people simply won’t go for it in the long run. And if they did, they deserve what they get. You seem to assume that governments are competent in carrying out grandiose plans over many years. They are not.
what may appear to be a skill score of 0.5 today, say, could easily to turn out to actually have been a negative score
Again, you are assuming that the Governments can pull off those damaging plans. I’m much less sanguine about that.
if it’s calculation is wrong based on the melt results it produces, then that physics representation is likely propagating errors through the rest of the model in other ways also.
Calculations are not ‘wrong’ [computers can add]. Part of building the model is to get the code right for carrying forward the stuff calculated at earlier steps.
why the models – and adjustment methods used on those models – wind up representing reality and if there is any reason to have confidence in their predictive abilities.
We have some confidence in the models based on how well they represent the past.
At best you can directly verify that over the past x years (25?, 50?, 100? – I’m sure both then number of stars & quality of observations has exploded recently), the stars that have been observed closely enough have behaved as you currently expect them to have behaved.
No, there are stars with the same mass and composition as the Sun, but billions of years older. We observe those to behave as predicted.
Our instrumental data & experience with any significant variation of solar output over billions of years?
See just above.


November 5, 2011 8:35 pm

Rational Debate says:
November 5, 2011 at 6:21 pm
A model can easily contain a number of perfectly valid physics components, or even incomplete ones :0) , and still have no physical basis overall.
Works the other way too: a model with good physical basis overall could still have corners where the physics is shaky or even wrong, but those may not be fatal to the performance.
Rational Debate says:
November 5, 2011 at 6:12 pm
Higgs Boson is having fun with physics models. :0)
Actually the LHC is run by models. The detectors are modeled in great detail so we can interpret the data. Without models that accurately describe the reality of the equipment, the raw data would make no sense at all.
The Higgs is predicted by models of the interactions. Those are fully expected to have great predictive power; if not, the whole thing makes no sense. If the boson is not found, we know that the models [and more importantly the underlying physics] were wrong, and that is important too, so we can look elsewhere. I don’t think anybody would claim that, if the boson is found just as predicted, the discovery is pure coincidence because the models have no predictive power. Actually, from the discussions here I should modify that: there are probably people who would say just that :-[)