The “ensemble” of models is completely meaningless, statistically

This comment comes from rgbatduke, who is Robert G. Brown at the Duke University Physics Department, on the No significant warming for 17 years 4 months thread. It has gained quite a bit of attention because it speaks clearly to truth. So that all readers can benefit, I’m elevating it to a full post.

rgbatduke says:

June 13, 2013 at 7:20 am

Saying that we need to wait for a certain interval in order to conclude that “the models are wrong” is dangerous and incorrect for two reasons. First — and this is a point that is stunningly ignored — there are a lot of different models out there, all supposedly built on top of physics, and yet no two of them give anywhere near the same results!

This is reflected in the graphs Monckton publishes above, where the AR5 trend line is the average over all of these models and in spite of the number of contributors the variance of the models is huge. It is also clearly evident if one publishes a “spaghetti graph” of the individual model projections (as Roy Spencer recently did in another thread) — it looks like the frayed end of a rope, not like a coherent spread around some physics supported result.

Note the implicit swindle in this graph — by forming a mean and standard deviation over model projections and then using the mean as a “most likely” projection and the variance as representative of the range of the error, one is treating the differences between the models as if they are uncorrelated random variates causing deviation around a true mean!

Say what?

This is such a horrendous abuse of statistics that it is difficult to know how to begin to address it. One simply wishes to bitch-slap whoever it was that assembled the graph and ensure that they never work or publish in the field of science or statistics ever again. One cannot generate an ensemble of independent and identically distributed models that have different code. One might, possibly, generate a single model that generates an ensemble of predictions by using uniform deviates (random numbers) to seed “noise” (representing uncertainty) in the inputs.

What I’m trying to say is that the variance and mean of the “ensemble” of models is completely meaningless, statistically, because the inputs do not possess the most basic properties required for a meaningful interpretation. They are not independent, their differences are not based on a random distribution of errors, there is no reason whatsoever to believe that the errors or differences are unbiased (given that the only way humans can generate unbiased anything is through the use of e.g. dice or other objectively random instruments).
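
To see why this matters, here is a minimal synthetic sketch in Python (the numbers are made up; this is not any actual model output) contrasting an ensemble of genuinely independent, unbiased estimates with an ensemble of "models" that share a common systematic bias. Only in the first case does "mean plus or minus the spread" say anything about where the truth lies.

```python
# Hypothetical illustration: compare an ensemble of independent, unbiased
# estimates with an ensemble of models sharing a common systematic bias.
# Only the former justifies treating mean +/- spread as an error range
# around the true value.
import numpy as np

rng = np.random.default_rng(0)
truth = 1.0                        # the quantity being "predicted"

# Case 1: independent, identically distributed, unbiased estimates
iid = truth + rng.normal(0.0, 0.3, size=30)

# Case 2: structurally different "models" sharing a common (unknown) bias
# plus smaller idiosyncratic differences: errors are neither independent
# nor centred on the truth.
biased = truth + 0.8 + rng.normal(0.0, 0.3, size=30)

for name, ens in [("iid", iid), ("shared bias", biased)]:
    mean, std = ens.mean(), ens.std(ddof=1)
    inside = abs(mean - truth) < 2 * std
    print(f"{name:12s} mean={mean:+.2f}  std={std:.2f}  "
          f"truth inside mean +/- 2*std? {inside}")
```

In the second case the spread only measures how much the models disagree with one another; it says nothing about how far the whole ensemble sits from reality.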

So why buy into this nonsense by doing linear fits to a function — global temperature — that has never in its entire history been linear, although of course it has always been approximately smooth so one can always do a Taylor series expansion in some sufficiently small interval and get a linear term that — by the nature of Taylor series fits to nonlinear functions — is guaranteed to fail if extrapolated as higher order nonlinear terms kick in and ultimately dominate? Why even pay lip service to the notion that R^2 or p for a linear fit, or for a Kolmogorov-Smirnov comparison of the real temperature record and the extrapolated model prediction, has some meaning? It has none.
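
The Taylor-series point can be made in a few lines: fit a straight line over a short window of an arbitrary smooth nonlinear function (chosen purely for illustration, nothing to do with any real temperature series) and watch the extrapolation fail as the neglected higher-order terms take over.

```python
# Minimal sketch: a linear fit over a short interval of a smooth nonlinear
# function looks fine locally but fails badly when extrapolated, because the
# neglected higher-order terms eventually dominate.
import numpy as np

def f(t):
    return np.sin(t) + 0.1 * t**2       # arbitrary smooth, non-linear function

t_fit = np.linspace(0.0, 2.0, 50)        # short interval used for the fit
slope, intercept = np.polyfit(t_fit, f(t_fit), 1)

for t in (1.0, 2.0, 5.0, 10.0, 20.0):
    linear = slope * t + intercept
    print(f"t={t:5.1f}  true={f(t):7.2f}  linear={linear:7.2f}  "
          f"error={linear - f(t):+8.2f}")
```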

Let me repeat this. It has no meaning! It is indefensible within the theory and practice of statistical analysis. You might as well use a ouija board as the basis of claims about the future climate history as the ensemble average of different computational physical models that do not differ by truly random variations and are subject to all sorts of omitted variable, selected variable, implementation, and initialization bias. The board might give you the right answer, might not, but good luck justifying the answer it gives on some sort of rational basis.

Let’s invert this process and actually apply statistical analysis to the distribution of model results Re: the claim that they all correctly implement well-known physics. For example, if I attempt to do an a priori computation of the quantum structure of, say, a carbon atom, I might begin by solving a single electron model, treating the electron-electron interaction using the probability distribution from the single electron model to generate a spherically symmetric “density” of electrons around the nucleus, and then performing a self-consistent field theory iteration (resolving the single electron model for the new potential) until it converges. (This is known as the Hartree approximation.)

Somebody else could say “Wait, this ignores the Pauli exclusion principle” and the requirement that the electron wavefunction be fully antisymmetric. One could then make the (still single electron) model more complicated and construct a Slater determinant to use as a fully antisymmetric representation of the electron wavefunctions, generate the density, perform the self-consistent field computation to convergence. (This is Hartree-Fock.)

A third party could then note that this still underestimates what is called the “correlation energy” of the system, because treating the electron cloud as a continuous distribution through which the electrons move ignores the fact that individual electrons strongly repel and hence do not like to get near one another. Both of the former approaches underestimate the size of the electron hole, and hence they make the atom “too small” and “too tightly bound”. A variety of schemes are proposed to overcome this problem — using a semi-empirical local density functional being probably the most successful.

A fourth party might then observe that the Universe is really relativistic, and that by ignoring relativity theory and doing a classical computation we introduce an error into all of the above (although it might be included in the semi-empirical LDF approach heuristically).

In the end, one might well have an “ensemble” of models, all of which are based on physics. In fact, the differences are also based on physics — the physics omitted from one try to another, or the means used to approximate and try to include physics we cannot include in a first-principles computation (note how I sneaked a semi-empirical note in with the LDF; although one can derive some density functionals from first principles (e.g. Thomas-Fermi approximation), they usually don’t do particularly well because they aren’t valid across the full range of densities observed in actual atoms). Note well, doing the precise computation is not an option. We cannot solve the many body atomic state problem in quantum theory exactly any more than we can solve the many body problem exactly in classical theory or the set of open, nonlinear, coupled, damped, driven chaotic Navier-Stokes equations in a non-inertial reference frame that represent the climate system.

Note well that solving for the exact, fully correlated nonlinear many electron wavefunction of the humble carbon atom — or the far more complex Uranium atom — is trivially simple (in computational terms) compared to the climate problem. We can’t compute either one, but we can come a damn sight closer to consistently approximating the solution to the former compared to the latter.

So, should we take the mean of the ensemble of “physics based” models for the quantum electronic structure of atomic carbon and treat it as the best prediction of carbon’s quantum structure? Only if we are very stupid or insane or want to sell something. If you read what I said carefully (and you may not have — eyes tend to glaze over when one reviews a year or so of graduate quantum theory applied to electronics in a few paragraphs, even though I left out perturbation theory, Feynman diagrams, and ever so much more :-) you will note that I cheated — I snuck in a semi-empirical method.

Which of these is going to be the winner? LDF, of course. Why? Because the parameters are adjusted to give the best fit to the actual empirical spectrum of Carbon. All of the others are going to underestimate the correlation hole, and their errors will be systematically deviant from the correct spectrum. Their mean will be systematically deviant, and by weighting Hartree (the dumbest reasonable “physics based approach”) the same as LDF in the “ensemble” average, you guarantee that the error in this “mean” will be significant.

Suppose one did not know (as, at one time, we did not know) which of the models gave the best result. Suppose that nobody had actually measured the spectrum of Carbon, so its empirical quantum structure was unknown. Would the ensemble mean be reasonable then? Of course not. I presented the models in the way physics itself predicts improvement — adding back details that ought to be important that are omitted in Hartree. One cannot be certain that adding back these details will actually improve things, by the way, because it is always possible that the corrections are not monotonic (and eventually, at higher orders in perturbation theory, they most certainly are not!). Still, nobody would pretend that the average of a theory with an improved theory is “likely” to be better than the improved theory itself, because that would make no sense. Nor would anyone claim that diagrammatic perturbation theory results (for which there is a clear a priori derived justification) are necessarily going to beat semi-heuristic methods like LDF, because in fact they often do not.
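
A toy numerical version of this argument, with entirely made-up values standing in for the "exact" answer and the hierarchy of approximations (these are not real Hartree, Hartree-Fock or LDF energies): when every member of the hierarchy errs on the same side, averaging the crude models with the best one can only drag the result away from the best one.

```python
# Made-up numbers only: a hierarchy of approximations that all err on the
# same side of the truth.  The ensemble mean is worse than the best member.
truth = -37.8                      # pretend "exact" value, arbitrary units
models = {
    "crude":  -35.0,               # most physics omitted, largest error
    "better": -36.5,
    "best":   -37.5,               # semi-empirically tuned, closest to truth
}

ensemble_mean = sum(models.values()) / len(models)
for name, value in models.items():
    print(f"{name:7s} error: {value - truth:+.2f}")
print(f"ensemble mean error: {ensemble_mean - truth:+.2f}")
```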

What one would do in the real world is measure the spectrum of Carbon, compare it to the predictions of the models, and then hand out the ribbons to the winners! Not the other way around. And since none of the winners is going to be exact — indeed, for decades and decades of work, none of the winners was even particularly close to observed/measured spectra in spite of using supercomputers (admittedly, supercomputers that were slower than your cell phone is today) to do the computations — one would then return to the drawing board and code entry console to try to do better.

Can we apply this sort of thoughtful reasoning to the spaghetti snarl of GCMs and their highly divergent results? You bet we can! First of all, we could stop pretending that “ensemble” mean and variance have any meaning whatsoever by not computing them. Why compute a number that has no meaning? Second, we could take the actual climate record from some “epoch starting point” — one that does not matter in the long run, and we’ll have to continue the comparison for the long run because in any short run from any starting point noise of a variety of sorts will obscure systematic errors — and we can just compare reality to the models. We can then sort out the models by putting (say) all but the top five or so into a “failed” bin and stop including them in any sort of analysis or policy decisioning whatsoever unless or until they start to actually agree with reality.
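
The "compare to reality and bin the losers" procedure is simple enough to sketch. The series below are synthetic stand-ins for the observed record and the model runs (no real GCM output is used); the only point is the ranking step itself.

```python
# Rank stand-in "model runs" by RMSE against a stand-in "observed" record,
# keep the closest few, shelve the rest.
import numpy as np

rng = np.random.default_rng(1)
n_years = 30
obs = np.cumsum(rng.normal(0.01, 0.1, n_years))        # synthetic observations

# synthetic model runs, each with its own trend bias and noise
models = {f"model_{i:02d}": np.cumsum(rng.normal(0.01 + 0.01 * i, 0.1, n_years))
          for i in range(20)}

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

ranked = sorted(models, key=lambda name: rmse(models[name], obs))
keep, shelve = ranked[:5], ranked[5:]
print("keep   :", keep)
print("shelve :", len(shelve), "runs, e.g.", shelve[:3])
```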

Then real scientists might contemplate sitting down with those five winners and meditate upon what makes them winners — what makes them come out the closest to reality — and see if they could figure out ways of making them work even better. For example, if they are egregiously high and diverging from the empirical data, one might consider adding previously omitted physics, semi-empirical or heuristic corrections, or adjusting input parameters to improve the fit.

Then comes the hard part. Waiting. The climate is not as simple as a Carbon atom. The latter’s spectrum never changes, it is a fixed target. The former is never the same. Either one’s dynamical model is never the same and mirrors the variation of reality or one has to conclude that the problem is unsolved and the implementation of the physics is wrong, however “well-known” that physics is. So one has to wait and see if one’s model, adjusted and improved to better fit the past up to the present, actually has any predictive value.

Worst of all, one cannot easily use statistics to determine when or if one’s predictions are failing, because damn, climate is nonlinear, non-Markovian, chaotic, and is apparently influenced in nontrivial ways by a world-sized bucket of competing, occasionally cancelling, poorly understood factors. Soot. Aerosols. GHGs. Clouds. Ice. Decadal oscillations. Defects spun off from the chaotic process that cause global, persistent changes in atmospheric circulation on a local basis (e.g. blocking highs that sit out on the Atlantic for half a year) that have a huge impact on annual or monthly temperatures and rainfall and so on. Orbital factors. Solar factors. Changes in the composition of the troposphere, the stratosphere, the thermosphere. Volcanoes. Land use changes. Algae blooms.

And somewhere, that damn butterfly. Somebody needs to squash the damn thing, because trying to ensemble average a small sample from a chaotic system is so stupid that I cannot begin to describe it. Everything works just fine as long as you average over an interval short enough that you are bound to a given attractor, oscillating away, things look predictable and then — damn, you change attractors. Everything changes! All the precious parameters you empirically tuned to balance out this and that for the old attractor suddenly require new values to work.
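
The attractor point can be illustrated with the logistic map standing in for a chaotic system (it is emphatically not a climate model): statistics accumulated while the system sits on one attractor tell you little once the governing parameter shifts and the system settles onto a different one.

```python
# Toy illustration: the same chaotic map, two parameter regimes.  Statistics
# tuned to one attractor do not carry over to the other.
import numpy as np

def logistic_run(r, x0=0.2, n=5000, burn=500):
    x, out = x0, []
    for i in range(n + burn):
        x = r * x * (1.0 - x)
        if i >= burn:
            out.append(x)
    return np.array(out)

regime_a = logistic_run(r=3.56)    # periodic attractor
regime_b = logistic_run(r=3.90)    # strongly chaotic regime
print(f"regime A: mean={regime_a.mean():.3f}  std={regime_a.std():.3f}")
print(f"regime B: mean={regime_b.mean():.3f}  std={regime_b.std():.3f}")
```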

This is why it is actually wrong-headed to acquiesce in the notion that any sort of p-value or R^2 derived from an AR5 mean has any meaning. It gives up the high ground (even though one is using it for a good purpose, trying to argue that this “ensemble” fails elementary statistical tests). But statistical testing is a shaky enough theory as it is, open to data dredging and horrendous error alike, and that’s when it really is governed by underlying IID processes (see “Green Jelly Beans Cause Acne”). One cannot naively apply a criterion like rejection if p < 0.05, and all that means under the best of circumstances is that the current observations are improbable given the null hypothesis at 19 to 1. People win and lose bets at this level all the time. One time in 20, in fact. We make a lot of bets!
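
The "19 to 1" remark is easy to check by simulation: draw pairs of samples from the same distribution, so the null hypothesis is true by construction, and count how often a standard test still "rejects" at p < 0.05.

```python
# With a true null hypothesis and a p < 0.05 cutoff, roughly one test in
# twenty rejects anyway.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_tests, n_samples = 10_000, 50
false_positives = 0
for _ in range(n_tests):
    a = rng.normal(0.0, 1.0, n_samples)   # both samples drawn from the
    b = rng.normal(0.0, 1.0, n_samples)   # SAME distribution: null is true
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1
print(f"false positive rate: {false_positives / n_tests:.3f}  (expected ~0.05)")
```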

So I would recommend — modestly — that skeptics try very hard not to buy into this and redirect all such discussions to questions such as why the models are in such terrible disagreement with each other, even when applied to identical toy problems that are far simpler than the actual Earth, and why we aren’t using empirical evidence (as it accumulates) to reject failing models and concentrate on the ones that come closest to working, while also not using the models that are obviously not working in any sort of “average” claim for future warming. Maybe they could hire themselves a Bayesian or two and get them to recompute the AR curves, I dunno.

It would take me, in my comparative ignorance, around five minutes to throw out all but the best 10% of the GCMs (which are still diverging from the empirical data, but arguably are well within the expected fluctuation range on the DATA side), sort the remainder into top-half models that should probably be kept around and possibly improved, and bottom half models whose continued use I would defund as a waste of time. That wouldn’t make them actually disappear, of course, only mothball them. If the future climate ever magically popped back up to agree with them, it is a matter of a few seconds to retrieve them from the archives and put them back into use.

Of course if one does this, the GCM predicted climate sensitivity plunges from the totally statistically fraudulent 2.5 C/century to a far more plausible and still possibly wrong ~1 C/century, which — surprise — more or less continues the post-LIA warming trend with a small possible anthropogenic contribution. This large a change would bring out pitchforks and torches as people realize just how badly they’ve been used by a small group of scientists and politicians, how much they are the victims of indefensible abuse of statistics to average in the terrible with the merely poor as if they are all equally likely to be true with randomly distributed differences.

rgb

323 Comments
Gary Hladik
June 20, 2013 11:22 am

rgbatduke says (June 20, 2013 at 10:04 am): [snip]
Whoa! Another home run!
While I could follow–in a very superficial way–the original modeling-a-carbon-atom analogy, I appreciate the more familiar–to me, anyway–analogies of the Polya Urn, the tides/Newton, and the Ptolemaic solar system.
“Even if the predictions of catastrophe in 2100 are true…it is still not clear that we shouldn’t have opted for civilization building first as the lesser of the two evils.”
Lomborg’s Copenhagen Consensus takes a similar view, giving “civilization building” a much higher priority than mitigating Thermageddon.
“…beyond dumb. Dumber than dumb. Dumb cubed. The exponential of dumb.”
Might I suggest the term “galactically stupid”? 🙂

Frank
June 20, 2013 11:36 am

rgbatduke: This statement from the introduction to Chapter 10 of AR4 WG1 shows that the IPCC authors understand your position, but persist in drawing “problematic” statistical conclusions from their “ensemble of opportunity”:
“Many of the figures in Chapter 10 are based on the mean and spread of the multi-model ensemble of comprehensive AOGCMs. The reason to focus on the multi-model mean is that averages across structurally different models empirically show better large-scale agreement with observations, because individual model biases tend to cancel (see Chapter 8). The expanded use of multi-model ensembles of projections of future climate change therefore provides higher quality and more quantitative climate change information compared to the TAR. Even though the ability to simulate present-day mean climate and variability, as well as observed trends, differs across models, no weighting of individual models is applied in calculating the mean. Since the ensemble is strictly an ‘ENSEMBLE OF OPPORTUNITY’, without sampling protocol, the spread of models does NOT NECESSARILY SPAN THE FULL POSSIBLE RANGE OF UNCERTAINTY, and a STATISTICAL INTERPRETATION of the model spread is therefore PROBLEMATIC. However, attempts are made to quantify uncertainty throughout the chapter based on various other lines of evidence, including perturbed physics ensembles specifically designed to study uncertainty within one model framework, and Bayesian methods using observational constraints.” [MY CAPS]

Nick Stokes
June 20, 2013 12:45 pm

rgbatduke says: June 20, 2013 at 10:04 am
“Figure 1.4 in the unpublished AR5 appears poised to do exactly the same thing once again, turn an average of ensemble results, and standard deviations of the ensemble average”
I can’t see any standard deviation claimed there. But I’m getting weary of a lone battle dealing with poorly specified and described claims against AR5, so I’m happy to hand over to W.M.Briggs, often cited as an authority here.
“I therefore repeat to Nick the question I made on other threads. Is the near-neutral variation in global temperature for at least 1/8 of a century (since 2000, to avoid the issue of 13, 15, or 17 years of “no significant warming” given the 1997/1999 El Nino/La Nina one-two punch since we have no real idea of what “significant” means given observed natural variability in the global climate record that is almost indistinguishable from the variability of the last 50 years) strong evidence for warming of 2.5 C by the end of the century?”
Well, this was the actual topic of the original thread. And I’ve done my own version of SteveF’s calculation. And it turns out that when you look at different datasets and periods, and in particular get away from his use of just a linear function to fit the post-1950 period, then the exogenous variables of ENSO, Solar and volcanic aerosols (but mainly ENSO) do account for most of the slowdown. That is, if you remove them as SteveF did, there is a strong uptrend continuing to present. It happened that SteveF’s calc, for HAD4 and linear detrending was the one case that showed a weak remaining uptrend. Code and data are supplied.
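
For readers unfamiliar with the kind of adjustment described here, a generic sketch follows: regress a temperature series on exogenous indices (ENSO, solar, volcanic aerosols) plus a trend, then read off the trend with those influences accounted for. The arrays are synthetic placeholders; this is not SteveF's or Nick Stokes's actual code or data.

```python
# Generic "remove the exogenous variables" sketch with synthetic data.
import numpy as np

rng = np.random.default_rng(3)
n = 60 * 12                                    # a hypothetical monthly record
t = np.arange(n) / 12.0                        # time in years
enso = rng.normal(0, 1, n)                     # placeholder ENSO index
solar = np.sin(2 * np.pi * t / 11.0)           # placeholder solar-cycle proxy
volcanic = np.zeros(n)
volcanic[200:230] = -1.0                       # placeholder eruption pulse

# synthetic "temperature": trend + exogenous influences + noise
temp = (0.015 * t + 0.10 * enso + 0.05 * solar + 0.20 * volcanic
        + rng.normal(0, 0.1, n))

X = np.column_stack([np.ones(n), t, enso, solar, volcanic])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
print(f"fitted trend: {coef[1]:.4f} C/yr  (ENSO {coef[2]:+.3f}, "
      f"solar {coef[3]:+.3f}, volcanic {coef[4]:+.3f})")
```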

ThinkingScientist
June 20, 2013 2:30 pm

Nick Stokes: I gave the entire caption for AR5 figure 11.33. As RGB says, in (b) they show a mean and standard deviation envelope of the model results. It is nonsense. And worse, it can only be used to mislead policy makers and make AGW supporters think the models have predictive capability. They clearly don’t. I am continually astonished how clearly intelligent individuals like yourself can continue to defend the indefensible.

June 20, 2013 2:36 pm

RGB: that is the most eloquent and powerful demolition of CAGW I have ever read. You have my deepest admiration and respect.

June 20, 2013 2:36 pm

Eloquent…!

June 20, 2013 3:02 pm

Nick,
Your insights are always welcome, so please don’t be intimidated out of arguing your case here or anywhere else. (I have suffered similarly on realscience).
There is an intrinsic problem at the base of this post and discussion. There can never be a unique “theory” of the climate. Quantum theory may calculate the magnetic dipole of the electron “exactly” but still cannot “predict”, for example, the Higgs mass. It lies beyond (current) theory.
There is nothing wrong in principle with running competing GCM models with varying physical input parameters to compare with future climates. However, each run should be identified and described uniquely so as to be validated numerically if it turns out to accurately describe the future climate.
I suspect modellers know full well that this is impossible because their code is always evolving, and feedbacks and aerosols are always being varied in order to better match past warming. There is therefore an intrinsic uncertainty. That is why model runs are combined to produce an “ensemble” of future warming. These do indeed then resemble a fire-hose and accentuate their lack of predictive power. A cynic might be tempted to suggest this process is basic arse covering.
Surely it is best to be up front and honest and accept that inherent uncertainty will always remain. Why not simply accept that the “scientific consensus” basically covers just greenhouse theory? Feedbacks and climate sensitivity are still uncertain.

george e. smith
June 20, 2013 3:39 pm

“””””……JohnB says:
June 19, 2013 at 6:36 pm
I’ve always thought that averaging the models was the same as saying;
I have 6 production lines that make cars. Each line has a defect.
1. Makes cars with no wheels.
2. Makes cars with no engine.
3. Makes cars with no gearbox.
4. Makes cars with no seats.
5. Makes cars with no windows.
6. Makes cars with no brakes.
But “on average” they make good cars…….””””””
Well, I would say on average, they make truly lousy cars, because every one of them has some known defect.
None of the six production lines makes a functioning car.

Nick Stokes
June 20, 2013 3:57 pm

ThinkingScientist says: June 20, 2013 at 2:30 pm
“Nick Stokes: I gave the entire caption for AR5 figure 11.33. As RGB says, in (b) they show a mean and standard deviation envelope of the model results.”

Not true. Read it again. They give the median and quantiles. There’s no distribution assumed. It’s simply a description of their aggregate results.

Rob Ricket
June 20, 2013 4:59 pm

Nick,
Please look at the graph on 11-89. The word “mean” is clearly plastered on the graph in bold letters.
http://www.stopgreensuicide.com/Ch11_near-term_WG1AR5_SOD_Ch11_All_Final.pdf

Nick Stokes
June 20, 2013 5:37 pm

Rob Ricket says: June 20, 2013 at 4:59 pm
“Nick,
Please look at the graph on 11-89. The word “mean” is clearly plastered on the graph in bold letters.”

Yes, it is. We’ve been through this before, endlessly. The AR4 showed ensemble means, and talks at length about it. We discussed that. Roy Spencer shows ensemble means in a prominent WUWT post. W.M.Briggs says it’s fine.
But this post says someone did something very bad, deserving severe sanctions. It’s not clear what it was, and we still have no idea who did it or where (other than Lord M, who was pointed to, but he didn’t do much either). But it wasn’t just calculating an ensemble mean.

June 20, 2013 5:38 pm

Anyone who has ever tried to employ mathematical models, calibrate them to the observed world, and then use the model to project into the future knows that, even with a small model, things are extremely difficult. When one tries to solve a system of nonlinear equations one comes to the same conclusion — the need to fine-tune the model parameters to even find a solution (despite having calibrated to the real world), multiple attractors, etc. This is an argument I make in my book “Climate Change, Climate Science and Economics” (Springer 2013). I am glad a physicist has finally come out so clearly and strongly on the matter.

Nick Stokes
June 20, 2013 5:38 pm

Grr “pointer tom” = pointed to
[Fixed -w.]

David
June 20, 2013 7:26 pm

Nick says…”But this post says someone did something very bad, deserving severe sanctions. It’s not clear what it was, and we still have no idea who did it or where (other than Lord M, who was pointer tom but he didn;t do much either). But it wasn’t just calculating an ensemble mean.”
Correct Nick, it was calculating an ensemble mean in the following circumstances, and claiming that it (the ensemble mean) means anything but evidence of a failed hypothesis.
If the 10 percent of the models that are closest to the observations are all STILL wrong in the SAME direction, then this is, logically speaking, a clue that even those “best” models are systemically wrong and STILL oversensitive. In science, being CONSISTENTLY wrong in ONE DIRECTION is a clue that a basic premise is wrong. In science being wrong is informative.
When one is ALWAYS wrong to the oversensitive side of the equation, then you do not assume that the ensemble mean of all your unidirectionally OVERSENSITIVE wrong predictions gets you closer to the truth.

Nick Stokes
June 20, 2013 8:07 pm

David says: June 20, 2013 at 7:26 pm
“Correct Nick, it was calculating an ensemble mean in the following circumstances, and claiming it, (the ensemble means) means anything but evidence of a failed hypothesis.”

David, the claim that models are just wrong has been made loudly and often. It has convinced all those likely to be convinced. To make progress you need something else. RGB has tried to do that with a methods argument.
If you can make that argument, that someone put together results with wrong methods and it matters, then you might convince some people who think the models themselves have value. Such an argument would need to be properly referenced. It hasn’t been. But if you throw that away and just keep putting in caps that the models are wrong, there’s no progress.

John Bills
June 20, 2013 9:12 pm

Depending on the weather, some models are not wrong, some of the time.
War is peace, freedom is slavery, ignorance is strength, warm is cold.

David
June 20, 2013 9:40 pm

Nick says, “David, the claim that models are just wrong has been made loudly and often.”
———————————————————
I did not just say “the models are wrong”, so you are willfully creating a strawman. I said the models are consistently wrong in one direction. Their bias is to always predict more warming than observed. Each one of the “caps” emphasizes that distinction, and furthermore shows the foolishness in moving away from the least wrong (but still wrong) models through an ensemble mean, and pretending that mean has any policy implications. (The fact that said policy, based on an ensemble mean of zero predictive skill, has cost billions, and impoverished many, is the crime which you are incapable of finding.)
It is not complicated, and your willful missing of the point is a poor reflection on you. Likewise in his last comment, and in a long previous comment, RGB specifically asked you six or seven pointed questions, which you never answered, as predicted by my Nick Stokes model, which I label S.I.T., short for sophist intelligent troll.

Nick Stokes
June 20, 2013 9:55 pm

David says: June 20, 2013 at 9:40 pm
‘I did not just say, “the models are wrong”,’

It doesn’t matter here how the models are said to be wrong. RGB is presenting a methods argument which tries to show that the conclusions are wrong even if the model runs are right. Your argument is not about the method; it is about some claimed facts about the model results.
The bottom line is that people who believe your set of facts are (rightly, for them) not going to care much about the method argument. It’s only useful for people whose minds are open on the merits of models.

June 21, 2013 12:34 am

Nick – what difference does it make that they show the mean or the median + the envelope/variance of the models? What message is being conveyed? The point is that any summary of the models like this is useless.
The IPCC continually tries to pretend that these are just “scenarios”, not predictions. If those “scenarios” are so unlikely as to be impossible, then what use are the models? This is not an academic exercise: IPCC reports are used to set public policies costing billions, even trillions of dollars. Do you think the summary for policymakers and the activists who take away those messages see your nuanced hair splitting over whether it’s the mean, or median, or whatever? Those graphs have a message to deliver: the message is that the models can predict the climate into the future and the future looks bad and we all need to act now.
As RGB points out, the models, whether plotted as spaghetti or summarised as a median and envelope or any other pointless statistic you want to agonise over, do not agree with reality and therefore should be disregarded. We cannot currently predict the future climate and, given its non-linearity and complexity, it is unlikely that we are going to be able to for a long time to come. If we cannot model the climate for even a short period with any degree of accuracy then we should stop doing so and admit that we don’t know what the future climate will be. Anything less is negligence and, if intended to mislead, criminal.

June 21, 2013 12:35 am

What is the difference between a “scenario” and a “prediction”?

Hoi Polloi
June 21, 2013 4:02 am

so I’m happy to hand over to W.M.Briggs, often cited as an authority here.

Interesting to notice that Stokes refers to Briggs when it suits him; never seen that before. BTW have you read the update in Briggs’ blog, Stokes?

Although it is true ensemble forecasting makes sense, I do NOT claim that they do well in practice for climate models.

David
June 21, 2013 4:41 am

A scenario and a prediction are of course the exact same thing, if you base policy on them. But Nick may not admit that. RGB made every point I made (and many others), and a lot of other points as well with regard to chaotic systems, etc. Nick ignores every point I summarised, all of which were within RGB’s comments, and instead concentrates on pedantic details of the relativist chaotic discussion. He refuses to answer RGB’s questions. He does not admit the glaring facts everyone else sees, but pretends his pedantic sophist disagreements with RGB somehow make the use of an ensemble mean for policy OK, when the ensemble mean is clearly taking one further and further from the models most reflective of real world observations, and the policy is immensely destructive.
The models are wrong in ONE direction. The ensemble mean is used for political purposes, not to find the best model, but to ignore the best models, and create a SCARY prediction. Everybody but lonely Nick sees this.

David
June 21, 2013 4:45 am

Ensemble forecasting does NOT make sense when your errors are all biased in one direction.

Nick Stokes
June 21, 2013 5:07 am

David says: June 21, 2013 at 4:41 am
“The ensemble mean is used for political purposes, not to find the best model, but to ignore the best models, and create a SCARY prediction. Everybody but lonely Nick sees this.”
Lonely? There’s me mate W.M.Briggs. And Dr Roy Spencer. And just now Bob Tisdale, also on the pages of WUWT. I think we’re close to 97% 🙂

Tim Clark
June 21, 2013 6:28 am

{ Now, are the original or adjusted ensemble forecasts any good? If so, then the models are probably getting the physics right. If not, then not. We have to check: do the validation and apply some proper score to them. Only that would tell us. We cannot, in any way, say they are wrong before we do the checking. They are certainly not wrong because they are ensemble forecasts. They could only be wrong if they fail to match reality. (The forecasts Roy S. had up a week or so ago didn’t look like they did too well, but I only glanced at his picture.)
Although it is true ensemble forecasting makes sense, I do NOT claim that they do well in practice for climate models. I also dispute the notion that we have to act before we are able to verify the models. That’s nuts. If that logic held, then we would have to act on any bizarre notion that took our fancy as long as we perceived it might be a big enough threat. }
Regardless of whether the use of ensembles is appropriate, I don’t think Briggs is giving the output a rousing endorsement.
