By Christopher Monckton of Brenchley
This time last year, as the honorary delegate from Burma, I had the honor of speaking truth to power at the Doha climate conference by drawing the attention of 193 nations to the then almost unknown fact that global warming had not happened for 16 years.
The UN edited the tape of my polite 45-second intervention by cutting out the furious howls and hisses of my supposedly grown-up fellow delegates. They were less than pleased that their carbon-spewing gravy-train had just tipped into the gulch.
The climate-extremist news media were incandescent. How could I have Interrupted The Sermon In Church? They only reported what I said because they had become so uncritical in swallowing the official story-line that they did not know there had really been no global warming at all for 16 years. They sneered that I was talking nonsense – and unwittingly played into our hands by spreading the truth they had for so long denied and concealed.
Several delegations decided to check with the IPCC. Had the Burmese delegate been correct? He had sounded as though he knew what he was talking about. Two months later, Railroad Engineer Pachauri, climate-science chairman of the IPCC, was compelled to announce in Melbourne that there had indeed been no global warming for 17 years. He even hinted that perhaps the skeptics ought to be listened to after all.
At this year’s UN Warsaw climate gagfest, Marc Morano of Climate Depot told the CFACT press conference that the usual suspects had successively tried to attribute The Pause to the alleged success of the Montreal Protocol in mending the ozone layer; to China burning coal (a nice irony there: Burn Coal And Save The Planet From – er – Burning Coal); and now, just in time for the conference, by trying to pretend that The Pause has not happened after all.
As David Whitehouse recently revealed, the paper by Cowtan & Way in the Quarterly Journal of the Royal Meteorological Society used statistical prestidigitation to vanish The Pause.
Dr. Whitehouse’s elegant argument used a technique in which Socrates delighted. He stood on the authors’ own ground, accepted for the sake of argument that they had used various techniques to fill in missing data from the Arctic, where few temperature measurements are taken, and still demonstrated that their premises did not validly entail their conclusion.
However, the central error in Cowtan & Way’s paper is a fundamental one and, as far as I know, it has not yet been pointed out. So here goes.
As Dr. Whitehouse said, HadCRUT4 already takes into account the missing data in its monthly estimates of coverage uncertainty. For good measure and good measurement, it also includes estimates for measurement uncertainty and bias uncertainty.
Taking into account these three sources of uncertainty in measuring global mean surface temperature, the error bars are an impressive 0.15 Cº – almost a sixth of a Celsius degree – either side of the central estimate.
The fundamental conceptual error that Cowtan & Way had made lay in their failure to realize that large uncertainties do not reduce the length of The Pause: they actually increase it.
Cowtan & Way’s proposed changes to the HadCRUT4 dataset, intended to trounce the skeptics by eliminating The Pause, were so small that the trend calculated on the basis of their amendments still fell within the combined uncertainties.
In short, even if their imaginative data reconstructions were justifiable (which, as Dr. Whitehouse indicated, they were not), they made nothing like enough difference to allow us to be 95% confident that any global warming at all had occurred during The Pause.
If one takes no account of the error bars and confines the analysis to the central estimates of the temperature anomalies, the HadCRUT4 dataset shows no global warming at all for nigh on 13 years (above).
However, if one displays the 2 σ uncertainty region, the least-squares linear-regression trend falls wholly within that region for 17 years 9 months (below).
The true duration of The Pause, based on the HadCRUT4 dataset, approaches 18 years. Therefore, the question Cowtan & Way should have addressed, but did not address, is whether the patchwork of infills and extrapolations and krigings they used in their attempt to deny The Pause was at all likely to constrain the wide uncertainties in the dataset, rather than adding to them.
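For readers who want to replicate the test, here is a minimal sketch in Python. It uses synthetic monthly anomalies standing in for HadCRUT4 and takes ±0.15 Cº as the assumed combined uncertainty; the question asked is simply whether the change implied by the least-squares trend stays inside that uncertainty band.

```python
import numpy as np

# Synthetic monthly anomalies standing in for HadCRUT4 (a flat series plus noise);
# a real check would use the published dataset and its stated uncertainties.
rng = np.random.default_rng(0)
months = np.arange(12 * 18)                 # roughly 18 years of monthly values
anomalies = 0.45 + rng.normal(0.0, 0.08, months.size)

HALF_WIDTH = 0.15                           # assumed half-width of the combined 2-sigma uncertainty, in C

def trend_inside_band(t, y, half_width=HALF_WIDTH):
    """Fit a least-squares line and ask whether the total change it implies over
    the period is smaller than the width of the uncertainty band."""
    slope, intercept = np.polyfit(t, y, 1)
    fitted = slope * t + intercept
    total_change = fitted[-1] - fitted[0]
    return slope, total_change, abs(total_change) < 2.0 * half_width

slope, change, inside = trend_inside_band(months, anomalies)
print(f"trend: {slope * 120:+.3f} C/decade; change over period: {change:+.3f} C")
print("trend indistinguishable from zero at the stated uncertainty:", inside)
```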
Publication of papers such as Cowtan & Way, which really ought not to have passed peer review, does indicate the growing desperation of institutions such as the Royal Meteorological Society, which, like every institution that has profiteered by global warming, does not want the flood of taxpayer dollars to become a drought.
Those driving the scare have by now so utterly abandoned the search for truth that is the end and object of science that they are incapable of thinking straight. They have lost the knack.
Had they but realized it, they did not need to deploy ingenious statistical dodges to make The Pause go away. All they had to do was wait for the next El Niño.
These sudden warmings of the equatorial eastern Pacific, for which the vaunted models are still unable to account, occur on average every three or four years. Before long, therefore, another El Niño will arrive, the wind and the thermohaline circulation will carry the warmth around the world, and The Pause – at least for a time – will be over.
It is understandable that skeptics should draw attention to The Pause, for its existence stands as a simple, powerful, and instantly comprehensible refutation of much of the nonsense talked in Warsaw this week.
For instance, the most straightforward and unassailable argument against those at the U.N. who directly contradict the IPCC’s own science by trying to blame Typhoon Haiyan on global warming is that there has not been any for just about 18 years.
In logic, that which has occurred cannot legitimately be attributed to that which has not.
However, the world continues to add CO2 to the atmosphere and, all other things being equal, some warming can be expected to resume one day.
It is vital, therefore, to lay stress not so much on The Pause itself, useful though it is, as on the steadily growing discrepancy between the rate of global warming predicted by the models and the rate that actually occurs.
The IPCC, in its 2013 Assessment Report, runs its global warming predictions from January 2005. It seems not to have noticed that January 2005 happened more than eight and a half years before the Fifth Assessment Report was published.
Startlingly, its predictions of what has already happened are wrong. And not just a bit wrong. Very wrong. No prizes for guessing in which direction the discrepancy between modeled “prediction” and observed reality runs. Yup, you guessed it. They exaggerated.
The left panel shows the models’ predictions to 2050. The right panel shows the discrepancy of half a Celsius degree between “prediction” and reality since 2005.
On top of this discrepancy, the trends in observed temperature compared with the models’ predictions since January 2005 continue inexorably to diverge:
Here, 34 models’ projections of global warming since January 2005 in the IPCC’s Fifth Assessment Report are shown as an orange region. The IPCC’s central projection, the thick red line, shows the world should have warmed by 0.20 Cº over the period (equivalent to 2.33 Cº/century). The 18 ppmv (201 ppmv/century) rise in the trend on the gray dogtooth CO2 concentration curve, plus other ghg increases, should have caused 0.1 Cº warming, with the remaining 0.1 Cº from previous CO2 increases.
Yet the mean of the RSS and UAH satellite measurements, in dark blue over the bright blue trend-line, shows global cooling of 0.01 Cº (–0.15 Cº/century). The models have thus already over-predicted warming by 0.22 Cº (2.48 Cº/century).
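As a check on the arithmetic, here is a small Python sketch of the unit conversion behind those figures. The window length of roughly 8.6 years (January 2005 to mid-2013) is my assumption, and the inputs are the rounded values quoted above, so the last digit of each output may differ slightly from the text.

```python
# Convert a temperature change over the comparison window into an equivalent
# C/century rate, and take the model-minus-observation discrepancy.
YEARS = 8.6                       # assumed window: January 2005 to mid-2013

def per_century(change, years=YEARS):
    """Scale a change over the window to an equivalent rate in C per century."""
    return change * (100.0 / years)

model_change = 0.20               # central IPCC projection over the window, C
observed_change = -0.013          # RSS/UAH mean over the window (quoted above, rounded, as -0.01), C

print(f"model rate:    {per_century(model_change):+.2f} C/century")
print(f"observed rate: {per_century(observed_change):+.2f} C/century")
gap = model_change - observed_change
print(f"discrepancy:   {gap:+.2f} C ({per_century(gap):+.2f} C/century)")
```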
This continuing credibility gap between prediction and observation is the real canary in the coal-mine. It is not just The Pause that matters: it is the Gap that matters, and the Gap that will continue to matter, and to widen, long after The Pause has gone. The Pause deniers will eventually have their day: but the Gap deniers will look ever stupider as the century unfolds.
M Simon says:
November 20, 2013 at 6:10 am
Well, it could be slight sleight of hand, or it could be profound sleight of hand. I’ll leave that to the reader.
Have commented before but the total sea ice area for year 2013 is going to be above average for a WHOLE YEAR very shortly. This will put a big crimp in any “Kriging” rubbish by Cowtan et al.
With the extent positive any Arctic amplification will be wiped out by Antarctic Deamplification and there will be a Hockey stick spike down in global cooling for 2013. Cannot wait.
Metamorphosis of Climate Change.
The hockey schtick ‘s transformed
into a playing field of ups ‘n downs,
that show the variability we know
is climate change. It’s strange that
climate modellers in cloud towers
never recognised that the complex
inter-acting ocean-land and atmo-
spheric system that is our whether
does
this.
Beth the serf.
Lewis P Buckingham:
I’m not sure the case of dark matter is really analogous to the case of CO2. In the case of dark matter, we know *something* is there, because there is not enough visible matter in galaxies to account for the motion of stars on the outer edges of those galaxies–in other words, the total mass of the galaxy inferred from its total gravity, which determines the motions of its stars, is larger than the mass we can see. The question is what the extra mass is, since it doesn’t show up in any other observations besides the indirect inference from the galaxy’s total gravity.
In the case of CO2, however, we know what the “dark matter” is–we can directly detect the CO2 in the atmosphere. The question is how much effect that CO2 has on the climate, i.e., how much “gravity” it exerts: the climate is warming *less* than the models say it ought to given how much CO2 concentrations have risen. In other words, rather than seeing an indirect effect but not directly observing the dark matter that causes it, we see the “dark matter”–the CO2–but we don’t see the indirect effect that the models claim should be there.
I disagree. Earth and models both are systems of chaotic weather, for which after a period of time a climate can be discerned. The timing of GCM weather events is not informed by any observations of the timing of Earth events; they are initialized going way back and generate synthetic weather independently. This is true of medium term events like ENSO; if the Earth happens to have a run of La Niñas (as it has), there is no way a model can be expected to match the timing of that.
The only thing we can expect them to have in common is that long run climate, responding to forcings. If you test models independently, that will take a very long time to emerge with certainty. If you aggregate models, you can accelerate the process.
Sorry it took a day to get back to this, but I do think it is important to respond.
I agree that the climate system is nonlinear and chaotic. I’m not precisely certain what you mean about “after a time a climate can be discerned”, since a glance at the historical record suffices to demonstrate that either climate is discernible after a very short time or the earth has no fixed climate — the climate record is always moving, never stable, as one would expect from a nonlinear chaotic system. Was the LIA part of the “climate”? Apparently not, it only lasted a century or so. Was the rise out of the LIA a stable “climate”? Not at all — things warm, things cool. The autocorrelation time of GASTA is what, at most 20 years across the last 160? More likely 10. Noting well that forming the autocorrelation of an anomaly is more than a bit silly — the actual autocorrelation time of the climate e.g. GAST is infinite, and what one is looking at with GASTA is fluctuation-dissipation, as in the fluctuation-dissipation theorem, not first order autocorrelation of the global average surface temperature (or any other aspect of “climate”, as they all not only vary but show clear if transient periodicities and both long-term and short-term trends).
As for the GCMs not being informed by actual weather, sure. However, they take great pains to do a Monte Carlo sampling of initial conditions for the precise reason that — if they work — the distribution of outcomes should in some sense be representative of the actual distribution of outcomes. It is in precisely this sense that individual GCMs should be rejected. The sampling itself produces a p-value for the actual climate, and ones where nearly all runs spend nearly all of their time consistently above the actual climate — pardon me, “weather” — can form the basis for a perfectly legitimate test of the null hypothesis “this GCM is a quantitatively accurate predictor of the climate”. Otherwise you are in the awkward position of any “believer” whose pet hypothesis isn’t in good correspondence with reality — forced to assert that reality is somehow in an improbable state instead of in the most probable state. Sure, this could always be true — but one shouldn’t bet on it. Literally. Even though you seem to be doing just that.
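To make the p-value idea concrete, here is a toy sketch in Python, using decadal trends rather than the full time series and entirely invented numbers: given trends from an ensemble of runs of a single model, the fraction of runs at or below the observed trend serves as an empirical p-value for the null hypothesis that reality is a typical member of that model's distribution of outcomes.

```python
import numpy as np

# Hypothetical decadal trends (C/decade) from 40 Monte Carlo runs of one model,
# plus an observed trend; every number here is invented for illustration.
rng = np.random.default_rng(1)
model_trends = rng.normal(loc=0.22, scale=0.06, size=40)   # this model runs warm
observed_trend = 0.05

def empirical_p(runs, observed):
    """Fraction of runs at or below the observation; if nearly every run sits
    above reality, this is small and the null hypothesis is in trouble."""
    runs = np.asarray(runs)
    return (np.sum(runs <= observed) + 1) / (runs.size + 1)   # add-one so p is never exactly 0

p = empirical_p(model_trends, observed_trend)
print(f"empirical p-value that reality is a typical run of this model: {p:.3f}")
print("reject at the 5% level:", p < 0.05)
```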
When you assert that they are failing because there has been e.g. “a run of La Niñas”, you are asserting first and foremost that they are failing and you are grasping for a reason to explain the failure. This is very likely to be one of many reasons for the failure, and the failure to predict GASTA is one of many failures, such as the failure to predict the correct kind of behavior for LTT. So we actually seem to be in agreement that they are failing, or you wouldn’t make excuses. It would be a bit simpler — and arguably a bit more honest — to state “Yes, the GCMs are failing, and here is a possible explanation for why” rather than asserting that they are correct even though they are failing. And even your remark about La Niñas concedes the further point that natural variation is in fact responsible for a lot more of the climate’s total variation than the IPCC seems willing to acknowledge — you can’t have it both ways that natural variation is small and that natural variation (in the form of a run of La Niñas) is responsible for the lack of warming. I could just as easily turn around and — with considerable empirical justification — point out that the 1997-1998 Super El Niño is the only event that has produced visible warming of the climate in the entire e.g. UAH or RSS record or for that matter, in HADCRUT4 in the last 33 years:
http://www.woodfortrees.org/plot/hadcrut4gl/from:1980/to:2013
Then one has to contend with the question of whether or not it is ENSO — by your own assertion an unpredictable natural cycle that can without any question cause major heating or cooling episodes that are independent of CO_2 — that is the dominant factor in the time evolution of the climate, and obviously, a factor that is neither correctly predicted nor correctly accounted for in GCMs.
All of this is instantly apparent from your apologia for GCMs. However, it is the last assertion that I am writing to be sure to reply to, as it is rather astounding. You assert that aggregating independently conceived and executed GCM results will somehow “speed their convergence” to some sort of long term prediction.
Nick, you know that I respect you and take what you say seriously, but this is an absolutely indefensible assertion not in physics, but in the theory of statistics. It is quite simply incorrect, and badly incorrect at that, empirically incorrect. The GCMs already do Monte Carlo sampling over a phase space of initial conditions. One at a time, this is perfectly legitimate as long as random numbers with some sane distribution are used to produce the sampling. However, the GCMs themselves do not converge to a common prediction on any timescale. Their individual Monte Carlo averages do not converge to a common prediction on any timescale. GCMs are not pulled out of an urn filled with GCMs with some sort of distribution of errors around the “one true GCM” — or rather, they might be but there is no possible way to prove this a priori and it almost certainly is not true, based on the data instead of your religious belief that the models must have all of the physics right in spite of the fact that they don’t even agree with each other, and disagree badly with the actual climate in at least some cases.
If you think that you can prove your assertion, that an arbitrary set of models that individually do not converge to a single behavior and that may well contain shared systematic errors that prevent all of them from converging to the correct behavior at all must necessarily converge to the correct behavior faster when you average them, I’d love to see the proof. I’ve got a pretty good understanding of statistics and modelling — I make money at the game, have a major interest in a company that does this sort of thing for a living, and spent 15 to 20-odd years of my professional career doing large scale Monte Carlo simulations, and the last six or seven writing dieharder, which is basically an extended exercise in computational hypothesis testing using statistics and (supposedly) random numbers.
I would bet a considerable amount of money that you cannot prove, using the axioms and methods of statistics, that a single GCM will ever converge to the correct climate (given that if true, good luck telling me which one since they all go to different places in the long run, which is one of many reasons that the climate sensitivity is such a broad range even for the GCMs, given that they don’t even agree on the internal parametric values of physical quantities that clearly are relevant to their predictions).
I would bet a further considerable amount of money that you cannot prove that either the parametric initialization of different climate models or their internal structure can in any possible sense be asserted to have been drawn out of some sort of urn by a random process, whereby you fail in the first, most elementary requirement of statistics — that samples in some sort of averaging process be independent and identically distributed — where the samples in question are GCMs themselves. Monte Carlo of initialization of a model is precisely this, which is why the distribution of outcomes has some (highly conditional) meaning. But suppose I simply copy one model 100 times (giving its predictions 100x the weight of any other)? Is this going to somehow “accelerate the convergence process”?
Of course not. You can average 1000, 10000, 10^18 incorrect models and no matter how small you make the variance around the mean, you will have absolutely no theoretical basis for asserting that the mean is a necessary predictor for reality for anything but a vanishingly small class of incorrect models — single models that are individually correct (although how you know this a priori is and will continue to be an issue) but that have precisely the sort of “incorrectness” one can associate with e.g. white noise in the initial conditions and can compensate for by sampling. And to be quite honest, for chaotic dynamical systems it isn’t clear that one can compensate, even with this sort of sampling. In fact, the definition of a chaotic dynamical system is that it is one where this does not as a general rule happen, where tiny perturbations of initial conditions lead to wildly different, often phase-space fillingly different, final states.
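A toy numerical version of that point, in Python with invented numbers: if every model in the ensemble shares the same systematic error, averaging more and more of them shrinks the spread of the ensemble mean but leaves the error exactly where it was.

```python
import numpy as np

rng = np.random.default_rng(2)
TRUE_VALUE = 1.0          # the quantity the "real climate" actually takes
SHARED_BIAS = 0.5         # systematic error common to every model in the ensemble

def ensemble_mean(n_models, runs_per_model=50):
    """Average over n_models biased models, each already averaged over its own runs."""
    model_means = [
        (TRUE_VALUE + SHARED_BIAS + rng.normal(0.0, 0.3, runs_per_model)).mean()
        for _ in range(n_models)
    ]
    return float(np.mean(model_means))

for n in (1, 10, 1000):
    estimate = ensemble_mean(n)
    print(f"{n:5d} models: mean = {estimate:.3f}, error vs truth = {estimate - TRUE_VALUE:+.3f}")
```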
There is, in other words, at least a possibility that if a butterfly flaps its wings just right, the Earth will plunge into an ice age in spite of increased CO_2 and all the rest. Or, a possibility that even without an increase in CO_2, the Earth would have continued to emerge from the LIA on almost exactly the observed track, or even warmed more than it has warmed. Or anything in between. That’s why the “unprecedented” phrase is so important in the political movement to assert that the science is beyond question. In actual fact, the climate record has ample evidence of variability that is at least as large as everything observed in the last 1000 years, and that even greater variability existed in the previous — fill in the longer interval of your choice — the evidence continues back 600 million years with cycles great and small, with the current climate being tied for the coldest climate the planet has ever had over 600 million years.
But climate science is never going to make any real progress until it acknowledges that the GCMs can be, and in some sense probably are, incorrect. Science in general works better when the participants lack certainty and possess the humility to acknowledge that even their most cherished beliefs can be wrong. So much more so when those beliefs are that the most fiendishly difficult computational simulation humans have ever undertaken, one that simulates a naturally nonlinear and chaotic system with clearly evident multistability and instability in its past behavior, is beyond question getting the right answer even as its predictions deviate from observations.
That sounds like rather bad science to me.
rgb
In your amazing silent-assassin way, did you just trash Dark Matter/Energy? I fervently hope so, I hate them both. Sorry to be slightly off-topic, but Professor Brown is so erudite I almost assume that whatever he says, must be true!
No, how can I do that? It’s (so far) an invisible fairy model, but gravitons are also (so far) invisible fairies. So are magnetic monopoles. It’s only VERY recently that the Higgs particle (maybe) stopped being an invisible fairy.
Physics has a number of cases where the existence of particles was inferred indirectly before the particles were directly observed, and a VERY few cases where we believe in them implicitly without being able to in any sense “directly” see them (quarks, for example — so much structure that we cannot disbelieve in them but OTOH we cannot seem to generate an isolated quark and even have theories to account for that). Positrons, neutrinos. But in all cases the physics community hasn’t completely believed in them before some experimentalist put salt on one’s tail (or the indirect evidence became overwhelming and the theory was amazingly predictive).
So I don’t need to “trash” dark matter/energy. It is one of several alternative hypotheses that might explain the data, where the list might not be exhaustive and might not contain the correct explanation (yet). It’s a particularly easy way to explain some aspects of the observations — invisible fairy theories often are, which is part of their appeal — but it is very definitely a “new physics” explanation and hence even proponents of the theory, if they are honest, will admit that it isn’t proven and could be entirely wrong, and the most honest of them could list some criteria for falsifying or verifying the hypothesis (which the most saintly of all would openly acknowledge is and will remain PROVISIONALLY falsifying/verifying, because evidence merely strengthens or weakens the hypothesis, it doesn’t really “prove” or “disprove” it, until we have a complete theory of everything, and Gödel makes it a bit unlikely that we will ever have a complete theory of everything and should count ourselves lucky to have a mostly CONSISTENT theory of a lot of stuff SO FAR).
“So far”, or “yet” is the key to real science. We have a set of best beliefs, given the data and a requirement of reasonable consistency, so far. I’m a theorist and love a good theory, but when experimentalists speak, theorists weep. Or sometimes crow and cheer, but often weep. As Anthony is fond of pointing out, the entire CAGW debate between humans is ultimately moot. Nature will settle it. If the GCMs are correct, sooner or later GASTA will jump up to rejoin their predictions. If they are not correct, the emerging gap may — not will but may — continue to widen. Or do something else. They could even be incorrect but GASTA COULD jump up to rejoin them before doing something else. But what nature does is the bottom line, not the theoretical prediction. We (must) use the former to judge the latter, not the other way around.
rgb
“Invisible fairy models,” good enough for me. How about String Theory and Super-String Theory? Is there any way for a human mind to accommodate 11 dimensions? Try as I will, I still only see three…
the 1997-1998 Super El Niño is the only event that has produced visible warming of the climate in the entire e.g. UAH or RSS record or for that matter, in HADCRUT4 in the last 33 years
>>>>>>>>>>>>>>>>>>>>>>>>>>>
wow.
And how do you know (first principles) that they are forcings? “Well, they just gots to be”? Or the most relevant ones? Or that they aren’t counteracted by more muscular feedback loops? Note that NO other “science” refers to forcings. It is a convenient fiction unique to CS. Wonder why?
Kind of backwards, or it seems so. Water vapor is the forcing and carbon dioxide is a feedback in an inverse relationship with the water vapor concentration. When there is high humidity, CO2 means squat; only in the cases of very low humidity do you get an effect from the CO2, for it is then the sole player, devoid of the H2O state changes. This applies only low or mid in the troposphere, but CO2 always has its place at or above the TOA, shedding heat just like the high-altitude water vapor.
rgbatduke says: November 22, 2013 at 8:31 am
“the earth has no fixed climate… “
RGB, I didn’t say it did. Climate, for this purpose, can be defined as the timescale on which the response to variation in forcings dominates. There’s a widespread view here that the LIA was a response to a sunspot minimum. I don’t know how sound that is, but it is a typical argument for climate from forcing.
GCMs, with or without forcing variations, produce all kinds of weather, as we are familiar with, on a time scale at least up to ENSO. The timing of this weather is random. It does not synchronize across runs, and there is no reason to expect it to synchronize with Earth. The only thing common between runs, and with Earth, in terms of timing is the forcing. There you can expect that runs and Earth should track each other. It’s on a multi-decadal scale when the effects of random weather (incl ENSO) have averaged out.
I don’t understand your reference to careful Monte Carlo initialization. They may do that as a matter of good practice, but in fact what they really try hard to do is to obliterate the effect of initial conditions. Here is text from a slide from a CMIP overview:
> Modelers make a long pre-industrial control
> Typically 1850 or 1860 conditions
> Perturbation runs start from control
> Model related to real years only through radiative forcing, solar, volcanoes, human emissions, land use etc
> Each ensemble an equally likely outcome
> Do not expect wiggles to match – model vs obs
That summarizes what I’ve been saying. Here is a 2004 paper by the same people summarising the then state of initialization and recommending a change. The then state was basically do your best to get anything right that might have long-term implications – mainly ocean distribution of heat. But of course, in 1850, that’s guesswork. The new idea is to get the present state, wind back to 1850, and then run forward. Obviously none of this is designed to predict from initial conditions.
“When you assert that they are failing because there has been e.g. ‘a run of La Niñas’, you are asserting first and foremost that they are failing…”
No, they are not failing. They never intended to predict La Niñas. In fact no one can, any more than they can predict volcanoes. That’s why you need to wait for the response to forcing to emerge. That’s all that GCMs claim to have in common with Earth. And it’s what they are designed to study.
That’s why asking that models be selected according to whether they agree with decadal trends is futile. It’s like selecting a mutual fund on last year’s results. You just get the ones that got lucky.
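The mutual-fund analogy is easy to simulate. In this toy Python sketch (invented numbers, and the deliberately extreme assumption of zero genuine skill) the best past performer is no better than average once the luck runs out:

```python
import numpy as np

rng = np.random.default_rng(3)
N_FUNDS = 100
past = rng.normal(0.0, 1.0, N_FUNDS)      # last decade's performance: pure luck
future = rng.normal(0.0, 1.0, N_FUNDS)    # next decade: independent of the past, by assumption

best = int(np.argmax(past))
print(f"best past performer: fund {best}, past score {past[best]:+.2f}")
print(f"its future score:    {future[best]:+.2f}")
print(f"mean future score of all funds: {future.mean():+.2f}")
```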
“natural variation is in fact responsible for a lot more of the climate’s total variation than the IPCC seems willing to acknowledge”
Do you have a quote on what the IPCC says? I don’t think they are reluctant to acknowledge natural variation. This comes back to the persistent fallacy that AGW is being deduced from the temperature record, and so natural variation as an alternative has to be denied. It isn’t, and natural variation simply delays the ability to discern the AGW signal. It doesn’t mean it isn’t there. That’s the usual complaint here – that scientists ask for too much patience.
“You assert that aggregating independently conceived and executed GCM results will somehow “speed their convergence” to some sort of long term prediction.”
Yes. This is routine in CFD. If you want to get a proper picture of vortex shedding, aggregate over a number of cycles (which might as well be separate runs), with careful matching. Again, you have a synchronised response to forcing plus random variation. Putting them together has to reinforce the common signal relative to the fluctuations. It may be hard to get population statistics, but the signal will be preferentially selected.
“the models must have all of the physics right in spite of the fact that they don’t even agree with each other”
They have to have a lot of physics right, just to run. They don’t claim to get common weather. And yes, models may not converge to the correct climate. They may indeed have biases. But aggregation of model runs will diminish random fluctuations. You’re right that that doesn’t prove that the combined result is a correct prediction, but it is a more coherent prediction.
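Here is a small synthetic sketch, in Python with invented numbers, of the variance-reduction half of that claim: every run shares a forced trend but carries its own independent "weather", and the multi-run mean hugs the forced signal much more closely than any single run does. (It says nothing about shared biases, which is the separate question raised above.)

```python
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(50)
forced_signal = 0.02 * years                 # common response to forcing, C

def one_run():
    """One run: the forced signal plus smoothed noise standing in for internal weather."""
    weather = np.convolve(rng.normal(0.0, 0.15, years.size), np.ones(5) / 5, mode="same")
    return forced_signal + weather

runs = np.array([one_run() for _ in range(30)])
single = np.sqrt(np.mean((runs[0] - forced_signal) ** 2))
pooled = np.sqrt(np.mean((runs.mean(axis=0) - forced_signal) ** 2))
print(f"RMS departure from the forced signal, single run:  {single:.3f} C")
print(f"RMS departure from the forced signal, 30-run mean: {pooled:.3f} C")
```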
Again I recommend this video of SST simulations from GFDL CM2. At about 2 min, it shows the Gulf stream. There are all sorts of wiggles and bends which are peculiar to this run alone, and I think the model may exaggerate them. But the underlying current, which is a response to forcing, is clear enough, and common from year to year, and I’m sure from run to run. Now here the signal is strong, but it would be clearer still if enough runs were superimposed so that the wiggles were damped and the common stream emerged.
“Invisible fairy models,” good enough for me. How about String Theory and Super-String Theory? Is there any way for a human mind to accommodate 11 dimensions? Try as I will, I still only see three…
If you do physics at all, you take linear algebra and several courses on ODEs and PDEs. In the process you learn to manage infinite dimensional linear vector spaces, because Quantum Theory in general is built on top of L^2(R^3) — the set of square-integrable functions on 3d Real space — which is an infinite-dimensional vector space.
There is also a fine line between a multidimensional algebra — the sort of thing you’d use to manipulate functions of five or ten or a hundred variables — and a multidimensional geometry. It’s pretty simple to “geometrize” such an algebra by constructing projectors (tensors and tensor operators).
It’s funny you ask about visualizing more than three dimensions. Our brains are evolved to conceptualize SL-R(3+1) because that’s where we apparently live, but I actually spend some time trying to imagine higher dimensional geometries. It would probably be easier if I dropped a bit of acid, but I do what I can without it…;-)
I’m guessing that the human brain is perfectly CAPABLE of “seeing” four-plus spatial dimensions, but we simply lack inputs with data on them that can be interpreted in that way. Our binocular vision and hearing and our spatial sensory input from our skin don’t contain the right encoding. But neural networks are pretty amazing, and our brains can adapt and repurpose neural hardware (and often have to, when e.g. we have strokes or accidents). That’s why I try to do the visualization — my brain is also capable (in principle) of synthesizing its own e.g. four dimensional input, and brain exercises help maintain and develop intelligence.
rgb
“You assert that aggregating independently conceived and executed GCM results will somehow “speed their convergence” to some sort of long term prediction.”
Yes. This is routine in CFD. If you want to get a proper picture of vortex shedding, aggregate over a number of cycles (which might as well be separate runs), with careful matching. Again, you have a synchronised response to forcing plus random variation. Putting them together has to reinforce the common signal relative to the fluctuations. It may be hard to get population statistics, but the signal will be preferentially selected.
I think you are still missing the point. One doesn’t aggregate over separately written PROGRAMS — certainly not unless you have direct experience of the programs independently working correctly — you aggregate over separate RUNS of ONE program with introduced randomness in its internals or in its initialization. This is stuff I understand very well indeed — Monte Carlo, Markov chains, and so on. Within one program this is perfectly valid and indeed best practice for any program that produces a non-deterministic result or a result that is basically a sample from a large space of alternatives. Many such samples can give you a statistical picture of the outcomes of the program being used, nothing more. To the extent that that program is reliable — something that is always an a posteriori issue, and I speak as a professional programmer here — this statistical picture may be of use. This is just as true for your CFD code as it is for any other piece of code ever written.
In the specific case of predictive modelling there is precisely one way to validate a predictive model. You use it to make predictions outside of the training data used to build the model, and you compare its predictions to the actually observed result. This is the foundation of physics itself (which is the mother of all predictive models, after all). For particularly complex models, after you validate the model, there is one remaining important step. That is, pray that your training and trial set capture the full range of variability of the system being modeled so that it keeps on working to predict new observational data as it comes in, without any particular bound. Many models — most models in highly multivariate, nonlinear, and especially chaotic systems with their Lyapunov exponents — will fail at some point. The conditions they were tuned for in the training set are themselves slowly varying parameters, or the system isn’t Markovian, or it is butterfly-effect sensitive. The streets are littered with homeless people who thought they could beat the market with a clever model just because they built a model that worked for past data and even predicted new data for a time.
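A minimal sketch of that single validation step, in Python with a synthetic series: fit on a training period, then score the model only on data it never saw.

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(100, dtype=float)
series = 0.5 + 0.01 * t + rng.normal(0.0, 0.1, t.size)   # toy "observations"

train, test = slice(0, 70), slice(70, 100)
coeffs = np.polyfit(t[train], series[train], 1)          # the "model", fitted on training data only
prediction = np.polyval(coeffs, t[test])                 # prediction outside the training period

rmse = np.sqrt(np.mean((prediction - series[test]) ** 2))
print(f"out-of-sample RMSE: {rmse:.3f} (the noise level is about 0.1)")
```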
Absolutely none of this is relevant to applying multiple, independently written models to the same problem. The same statistical principles that make it a GOOD idea to sample the space within a single model do not apply between models. Suppose you have two CFD programs that you can use to do your work. You purchased them from two different companies, and you know absolutely nothing about how they were written, their quality, or even whether or not they will “work” for your particular problem. You can probably assume that they went through “some” validation process in the companies that are selling them, but you cannot be certain that the validation process included systems with the level of detail or the particular shapes present in the system you are trying to simulate.
Do you want to assert that your best practice is to run both programs a few dozen times on your problem and average the results, because the average of the two programs — that, it rapidly becomes apparent, lead to completely different answers — is certain to be more accurate than either program by itself?
Of course not. First of all, you have no idea how accurate each program is for your particular problem independently! Second, there isn’t any good reason to think that errors in one program will cancel errors in the other, because the programs were not selected from an “ensemble of correctly written CFD programs” — this begs the question, you do not KNOW if they were correctly written or (more important) if they work, yet. If both programs contain the same error for some reason, both will on average produce erroneous results! You cannot eliminate errors or inadequacies or convergence problems by averaging, except by accident.
All you’ve learned from running the programs a few dozen times and observing that they give different results (on average) is that both of your CFD programs cannot be correct. You have no possible way to determine (just from looking at the distributions of their independently obtained results) which one is correct, and of course they could both be incorrect. You cannot assume that errors in one will compensate for errors in the other — for all you know, one of the two is precisely correct, and averaging in the erroneous one will strictly worsen your predictions.
There is only one way for you to determine which program you should use. Apply them independently to problems where you know the answer and compare their results. And in the end, apply them to your problem (with its a priori unknown answer) and see if they work. Since they give different answers, one of them will give better results than the other. You will be better off using the one that gives the better answer rather than using them both, once you’ve determined that one of them isn’t doing well.
The point is that neither CFD programs nor GCM programs constitute an ensemble of correctly written programs. There is no such thing as an a priori ensemble of correctly written programs. There is an ensemble of written programs, some of which may be correctly written. Averaging over the set of written programs does not produce a correctly written program, even on average.
Do you have a quote on what the IPCC says? I don’t think they are reluctant to acknowledge natural variation.
You mean the bit where they repeatedly assert that over half of the warming observed from 1950 or thereabouts on is due to increased CO_2? I could look up the exact lines (it occurs more than one time and has occurred repeatedly in all of the ARs) in the AR5 SPM, but why bother? I’m sure you’ve read it. Don’t you remember it? Heck, I can remember it being said in the Senate hearings. I didn’t expect you to disagree with this, I have to admit.
At the moment, BTW, the “natural variation” in question is roughly 0.6 C over 20 years, the difference between Nature with its natural variation and the worst GCMs with their unnatural variations. You might note that this is a quantity on the same order as (only a bit larger than) the entire warming observed over sixty years in e.g. HADCRUT4. It doesn’t even make sense.
I have to ask — have you looked at the performance of some of the individual models compared to reality? Do you seriously think that one of the models that is predicting almost three times as much warming as the “coolest” of the models (which are, in turn, still substantially warmer than observation) is still just a coin flip, just as likely to be correct as any other? If you applied your CFD programs to your real world problems and one of those programs consistently gave results that caused your jets to fall from the sky and was in substantial disagreement with the programs that gave the best empirical results in application to actual problems, would you keep using it and averaging it in because everybody knows that more models is better than fewer ones?
rgb
“Do you seriously think that one of the models that is predicting almost three times as much warming as the “coolest” of the models (which are, in turn, still substantially warmer than observation) is still just a coin flip, just as likely to be correct as any other?”
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Coin flip? Please. Most climate models are mere propaganda. Extreme models are included to skew the bastardized mean and affect public opinion and policy. To exclude failing models (let’s be generous – 85%) would cause warming predictions to crater and negatively affect red/green propaganda and policy. Can’t have that; hence, Nick Stokes’ mendacity.
rgb;
It is said, a man with one watch knows the time. A man with 2 watches is uncertain. 😉
Coin flip? Please. Most climate models are mere propaganda. Extreme models are included to skew the bastardized mean and affect public opinion and policy. To exclude failing models (let’s be generous – 85%) would cause warming predictions to crater and negatively affect red/green propaganda and policy. Can’t have that; hence, Nick Stokes’ mendacity.
Climate models are not “mere propaganda”. I’m quite certain that the people who wrote and who are applying the models to make predictions do so in reasonably good faith. Again, this sort of thing is commonplace in physics and e.g. quantum chemistry, another problem that is almost too difficult for us to compute.
On another thread, long ago, I pointed out how solving the problem of computing the electronic energy levels of atoms and molecules has many similarities to the problem of predicting the climate. In both cases one has to solve an equation of motion that cannot be solved analytically, and where simply representing the exact solution for even a fairly small molecule requires nearly infinite storage (imagine a Slater determinant for the 22 electrons in CO_2, each with two levels of spin — even the ground state is almost impossible to represent). To make anything like a reasonable, computable theory, a single-electron approach has to manage two “corrections” inherited from this non-representable, non-computable physics — the so-called “exchange energy” associated with the requirement that the final wavefunction be fully antisymmetric in electron exchange (satisfy the Pauli exclusion principle as electrons are Fermions) and the “correlation energy” that is a mixture of everything else left out in the approach — relativistic corrections, and the many body corrections resulting from the fact that electrons are strongly repulsive so that the wavefunction should vanish whenever any two electron coordinates are the same — the so called “correlation hole” in the wavefunction. Single electron solutions cannot represent an actual correlation hole.
Historically, attempts to solve even the problem of atomic states have worked through a progression of models. Perturbation theory was invented, allowing solutions to be represented in a single-electron basis of solutions to a simpler (but related) problem. The Schrodinger equation was generalized into the relativistic Dirac equation (that also accounted for spin). The many-body interaction was first treated using the Hartree approximation — each electron presumed to move in an “average” potential produced by all of the other electrons, basically — then improved with a Slater determinant for small enough problems into Hartree-Fock, which included exchange by making the wavefunction fully antisymmetric by construction. To do larger atoms, results from a free electron gas were turned into a Thomas-Fermi atom, perhaps the first “density functional” approach to the single electron problem. Hohenberg and Kohn proved an important theorem relating the ground state energy of any electronic system to a functional of the density, and Kohn and Sham turned it into a single electron mean field approach — basically a Hartree atom (or molecule) but with a density functional single electron potential. And for at least a decade or a decade and a half now, people have worked on refining the density functional approach semi-empirically to the point where it can do a decent job of computing quite extensive electronic systems.
This is the way theoretical physics is supposed to work for non-computable or difficult to compute problems, problems where we know we cannot precisely represent the answer and where the answer contains short-range singular complexity and long range nonlinearity. I could run down a partial list of computational methods that implement these general ideas in e.g. quantum chemistry — the “LCAO” (linear combination of atomic orbitals), the use of Gaussians instead of atomic orbitals (easier to compute), self-consistent field methods — but suffice it to say that there are many implementations of the computational methodology for any given approach to the problem, and there are MANY different approaches to all of the quantum electron problems where EVERYBODY KNOWS exactly how to write down the Hamiltonian for the system — it is “well-known physics”. The same is true all the way down to band theory (where I spent a fair bit of time developing my own approach, a nominally exact (formally convergent) solution to the single electron problem in a crystal potential), which is still an extension of the humble hydrogen atom, which is the ONLY quantum electronic atomic problem we can nominally solve exactly.
None of these people are trying to play political games with the results — the ability to do these computations is just plain useful — a key step along the path to reducing the engineering problem to 80% (comparatively cheap) computation and only 20% empirical testing instead of engaging in an endless Edisonian search for useful structures unguided by any sort of idea of what you are looking for. And so it is with climate — it is difficult to deny that being able to predict the weather out far enough that it becomes the “climate” instead would permit long range planning that would be just as useful as Pharaoh’s dream, just as the ability to predict hurricane tracks and intensities and predict the general weather days to weeks in advance is important and valuable now.
There is nothing whatsoever dishonest in the attempt — it is even a noble calling. And I do not believe that Nick Stokes or Joel Shore are in any reasonable sense of the word being dishonest as they express their reasonably well-informed opinions on this list, any more than I am. Two humans can be honest and disagree, especially about the future predictions produced by code solving a problem that is arguably more difficult by a few orders of magnitude than the quantum electronic problem.
However, the quantum electronic problem permits me to indicate to Nick in some detail the failure of his argument by contrasting the way that the results of GCMs are presented to the public (and the horrendously poor use of statistics to transform their results into a “projection” that isn’t a prediction and that — apparently — cannot be falsified or compared to reality) and the way quantum electronics has been treated from the beginning by the physics and quantum chemistry communities.
First of all, you can take thirty completely distinct implementations of the Hartree approach, and apply them to (say) the problem of computing the electronic structure of Uranium. If you do, it may well be that you get a range of answers, in spite of the fact that all of the programs implement “the same physics”. They may use different integration programs or ODE solution programs with different algorithms and adaptive granularity. They may be run on machines with different precision, and since the ODEs being solved are stiff, numerical errors can accumulate quite differently. The computations can be run to different tolerances, or summed to different points in a slowly convergent expansion. The computations may use completely different bases. They may well give widely distinct results, all while still basically solving a single electron, mean field problem based on known physics and neglecting unknown/non-computable stuff in a similar way! I’d say it would be more than a bit surprising if they all gave the same results.
Of course Uranium has its own, real, electronic structure:
http://education.jlab.org/itselemental/ele092.html
Except that even this picture isn’t correct — most of the energy levels in this list are (or should be) “hybridized” orbitals, which basically means that this list is a set of single electron labels in a basis that does not actually span the space where the solution lives. No single electron approach will give either the correct spectrum or the correct wavefunctions or the correct labelling because the single electron basis spans the wrong (and MUCH smaller) space compared to the one where the true wavefunction lives. But one can nevertheless measure Uranium’s electronic structure and spectrum and compare the results to the collection of computations suggested above.
So, can we assume — as Nick seems to think — that we can average over the many Hartree results and get a better (more accurate) a priori prediction of the electronic structure of Uranium? Of course not. Not only will all of the Hartree results be in error, they will all be in error in the same direction from the true spectrum. The single electron/Hartree approach ignores both the electron correlation hole and Pauli exchange. Both of these increase the effective short range repulsion and result in an atom that is strictly larger and less strongly bound than the Hartree atom. The correlation/exchange interaction, if included, will strictly increase the spectral energies computed by Hartree. Even if you manage to write code that works perfectly and gives you the Hartree energies to six significant digits, even if you fix all of the Hartree computations so that they agree to six significant digits, four or five out of those six significant digits will be wrong, systematically too low.
Now suppose that you consider a collection of codes, some of which are Hartree, some of which are Hartree-Fock (and include exchange more or less exactly, although that is probably still impossible for Uranium), some of which are Thomas-Fermi or Thomas-Fermi-Dirac, some of which are density functional (using programs of different ages and lines of research descent). For the record, Hartree will always underestimate energies by quite a bit, Hartree-Fock will (IIRC) always underestimate energies but not by so much, as the exchange hole accounts for part of the correlation hole present in Hartree without any density functional piece, and as one inserts various ab initio or semi-empirical Kohn-Sham density functionals, one can get errors of either sign, especially from the latter, which are retuned to give the right answers in different contexts, more or less recognizing that we cannot a priori derive the “exact” Kohn-Sham density functional that will work for all electronic configurations across all energy ranges and length scales.
Is there any possible way that one “should” average the output from all of the different codes in the expectation that one will get an improved prediction? If you’ve paid any attention at all, you will know that the answer is no. The inclusion of Hartree and Hartree-Fock results will result in a systematic error with a monotonic deviation from the correct (empirical) answers, while using a well-tuned semi-empirical potential will give you answers that are very close to the correct answers and may well make errors of either sign relative to the correct answers.
This counterexample makes it rather clear that there can be no possible theoretical justification for averaging the results of many independently written models as an “improvement” on those models. There are only correct models, that give good results all by themselves, and incorrect models, that don’t. And the only way to tell the difference is to compare the results of those models to experiment. Not to each other. Not to other experiments — e.g. discovering that Hartree works quite well for hydrogen and helium and isn’t too bad (with a small monotonic deviation) for light atoms, so surely it is valid for Uranium too! Again, this is obvious from a simple comparison of the results of fully converged codes — the answers they produce are all different. They cannot possibly all be correct.
In the case of quantum electronics we are fortunate. It is a comparatively simple system, one where I can write down the equation from which the solution must proceed, or link it e.g. here:
http://en.wikipedia.org/wiki/Density_functional_theory#Derivation_and_formalism
It is also one where the sign of the neglected term is positive definite and results in a deviation in the same direction as the correct consideration of exchange, so we can be certain of the sign of the error in theories that neglect or incorrectly treat this term. Yet in the search for the best possible solution to the problem of quantum electronics, physicists are constantly comparing their computations to nature, seeking to improve their theory and computation and thereby their results, and not hesitating to abandon approaches as they prove to be inadequate or systematically in error as better approaches come along. We never have people presenting the results of many of these computations applied to a benchmark atom and asserting that the average of the many computations is more reliable than the models in the best agreement with the actual result for the benchmark atom, nor do we have anyone who would assert that because Hartree-Fock and perturbation theory does a decent job at giving you the energy levels of Helium that we can absolutely rely on it for a computation of Uranium, or for that matter Oxygen, or Silicon, or Iron — let alone for Silicon Dioxide, or for polycrystalline Iron.
Nick asserts — and again, I’m sure he truly believes — that increasing CO_2 in the atmosphere will result in some measure of average global warming, because there is a simple, intuitive physical argument that supports this conclusion. It is his belief that this mechanism is correctly captured in the GCMs, in spite of the fact that the many different GCMs, which balance the opposing contributions of CO_2, water vapor/cloud albedo feedback, and aerosols all quite differently, lead to different predictions of the amount of warming likely to be produced by 600 ppm CO_2, and in spite of the fact that we cannot quantitatively or qualitatively explain the observed natural variation in global climate over as little as 1000 years, let alone 10,000 or 10,000,000. In other words, we cannot even predict or explain the gross movements of the climate over a time that CO_2 was not — supposedly — varying. Nick has directly stated that perhaps the LIA was caused by a Maunder minimum — and of course he could be right about that, although correlation is not causality.
What he cannot do is explain why the Maunder minimum would have caused a near-ice age at that time. The variation of solar forcing “should” have been far too small to produce such a profound effect (the coldest single stretch in 9000 years, as far as we can tell today) and there isn’t any good reason to think that Maunder type minima don’t happen regularly every few centuries, so why isn’t the Holocene punctuated with recurrent LIAs? This, in turn, leads one to speculation about possible mechanisms, all of which would constitute omitted physics. Because while there is some argument about whether or not the latter 20th century was a Grand Solar Maximum, it was certainly very active, and if global temperatures respond nonlinearly to solar variation, or respond through multiple mechanisms (some of which have been proposed, some of which may exist but not yet been proposed), that constitutes omitted physics in the GCMs that almost by definition is significant physics if it can produce the LIA and perhaps produce the MWP on the flip side of the coin, independent of CO_2.
So while I agree with him that the argument that increasing CO_2 will increase global average temperature is a compelling one, I do not agree that we know how much GAST will increase, or how it will increase, or where it will increase. I do not agree that we know with anything like certainty what the feedbacks are that might augment it or partially cancel it — or even what the collective average sign of those feedbacks is (granted that it might partially augment temperatures in some places and partially reduce them in others, compared to a baseline of only CO_2 based warming). I do not have any confidence at all that the GCMs have the physics right because when I compare them to reality precisely the same way I might compare the results of a quantum structure computation to a set of spectral measurements I find that they seem to be making a systematic error, an error that is in some cases substantial.
Indeed, when I compare the “fully converged” (extensively sampled) predictions of the GCMs to each other I find that they do not agree. They differ substantially in their prediction of total warming at 600 ppm CO_2. Some GCMs produce only around 1.5 C total warming — basically unenhanced CO_2 based warming. Some produce over twice that, down from still earlier models that produced over three and a half times that. As I hope I’ve fairly conclusively shown, all that this shows is that most of the GCMs are definitely wrong simply because they disagree. If I say 3.5 C, and you say 1.5 C, we cannot both be right.
There is precisely one standard for “rightness” in the context of science. It is the same whether one is considering the predictions of quantum electronic structure computations or the far less likely to be reliable predictions of GCMs. That is to take the predictions and compare them to reality, and use the comparison to rank the models from least likely to be correct (given the evidence so far) to most likely to be correct (given the evidence so far), as well as to identify likely systematic errors in even the most successful models where they systematically deviate from observation.
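For concreteness, here is a minimal sketch in Python of that ranking procedure. The model names and 20-year warming figures below are invented placeholders, not real CMIP5 output; the sketch simply orders a set of models by their distance from observation in units of the observational uncertainty.

```python
# A minimal sketch of ranking models by their disagreement with observation.
# All numbers are hypothetical placeholders, not real CMIP5 output.

observed_warming = 0.1   # hypothetical observed 20-year warming, degrees C
obs_uncertainty  = 0.15  # hypothetical 1-sigma observational uncertainty, degrees C

model_predictions = {    # hypothetical model name -> predicted 20-year warming (C)
    "Model-A": 0.15,
    "Model-B": 0.35,
    "Model-C": 0.70,     # roughly 0.6 C above observation, like the worst case cited below
}

def discrepancy(predicted, observed=observed_warming, sigma=obs_uncertainty):
    """Distance from observation in units of the observational uncertainty."""
    return abs(predicted - observed) / sigma

# Sort from most plausible (smallest discrepancy) to least plausible.
ranked = sorted(model_predictions.items(), key=lambda kv: discrepancy(kv[1]))
for name, pred in ranked:
    print(f"{name}: predicted {pred:+.2f} C, {discrepancy(pred):.1f} sigma from observation")
```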
I really don’t see how anyone could argue with this. A model in CMIP5 that predicts almost 0.6C more warming than has been observed over a mere 20 years (en route to over 3.5 C total warming over the 21st century) has to be less likely to be correct than a model that is in less substantial disagreement with observation, quite independent of what your prior beliefs concerning the models in question were. Including it in a summary for policy makers on an equal footing with models that are in far less substantial disagreement with reality, and without any comment intended to draw that fact to the attention of the policy makers, using that model as part of what amounts to a simple average predicting “most likely” future warming — those aren’t the sins of the GCMs themselves, those are the sins of those who wrote the AR5 SPM, and that without question is highly dishonest.
Or incompetent. It’s difficult to say which.
rgb
Brian H says:
November 24, 2013 at 12:45 am
rgb;
It is said, a man with one watch knows the time. A man with 2 watches is uncertain. 😉
And a damn good saying it is.
In my physics class, we compute the period of a “Grandfather clock” that uses a long rod with a mass at one end. In fact, we compute it several ways. First, we neglect the mass of the rod and treat the mass at the end (the “pendulum bob”) as a point mass. This gives us a period that we could use to predict the future time as read off that clock. However, this underestimates the moment of inertia of the pendulum bob! The clock will run systematically slow. One has a perfectly good physical model for an estimate of the period, but it makes a systematic error by neglecting a piece of physics that turns out to be important, depending on the relative dimensions of the rod and pendulum bob.
But wait! That isn’t right either! We have to include the mass of the rod! That too has a positive contribution to the moment of inertia, but it also makes a contribution to the driving torque! Suddenly, the period of the clock depends on the relative masses of rod and pendulum bob, the length of the rod, and the radius of the pendulum bob, in a nontrivial way! One can end up with a period that is too long or too short, so that a correction that failed to take the rod into account in detail could have the wrong sign, and indeed particular values of m, M, L and R (the masses and dimensions of the rod and of a disk-shaped bob centered at L, for example) might lead to exactly the same period as the uncorrected simple pendulum.
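Here is a short numerical sketch of those three successive approximations, using the standard small-angle physical-pendulum formula T = 2π√(I/(M_total·g·d)). The rod and bob dimensions are invented purely for illustration, and the bob is assumed to be a uniform disk swinging in its own plane.

```python
# A sketch of the pendulum-clock example: point-mass bob, then extended disk bob,
# then rod + bob. L, m, M, R are assumed illustrative values, not measured ones.
import math

g = 9.81          # m/s^2
L, m = 1.0, 0.4   # rod length (m) and rod mass (kg) -- assumed values
M, R = 2.0, 0.10  # bob mass (kg) and bob radius (m) -- assumed values

def period(I, M_total, d):
    """Small-angle period of a physical pendulum: T = 2*pi*sqrt(I / (M_total*g*d))."""
    return 2 * math.pi * math.sqrt(I / (M_total * g * d))

# 1) Point-mass bob, massless rod: the textbook simple pendulum.
T_simple = 2 * math.pi * math.sqrt(L / g)

# 2) Extended disk bob, still massless rod: extra moment of inertia (parallel-axis
#    theorem), so the real period is longer than the simple estimate.
I_bob = M * L**2 + 0.5 * M * R**2
T_bob = period(I_bob, M, L)

# 3) Massive rod included: it adds inertia AND restoring torque, so the correction
#    can go either way depending on m, M, L, R.
I_total = I_bob + (1.0 / 3.0) * m * L**2
d_cm = (M * L + m * L / 2) / (M + m)   # distance from pivot to center of mass
T_full = period(I_total, M + m, d_cm)

print(f"simple pendulum:        {T_simple:.4f} s")
print(f"disk bob, massless rod: {T_bob:.4f} s (longer: clock runs slow vs. the simple estimate)")
print(f"rod + disk bob:         {T_full:.4f} s (can come out longer or shorter)")
```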
Now try to correct for the fact that the clock sits in my foyer, and in the afternoon the sun shines in and warms it, lengthening the rod and expanding the pendulum bob, and every night I turn the thermostat down so that the hall gets cold, but of course on warm nights it doesn’t. Try to correct for the nonlinearity of the spring that restores energy to the pendulum bob against friction and damping. Try to correct for the thermal properties of the spring! All in a predictive model. Eventually it becomes a lot easier to just watch the clock and see what it does, and perhaps try to explain it a posteriori. Physics in many complex systems suffices for us to understand in general what makes the grandfather clock tick, and even helps us to understand the sign and rough contribution of many of the corrections we might layer on, beginning with the simplest theory that explains the general behavior, but it doesn’t turn out to anticipate the effect of the gradual oxidation of the oil in the gears and the rate at which they gum up as ambient dust and metal wear are ground into the clock, or the effect that moving the clock across the room when redecorating has on the assumption of afternoon sunshine, or the effect of tipping it back against the wall with coasters to make sure that it cannot be pulled over onto grandchildren’s heads. We inevitably idealize in countless ways in physics computations, which is why good engineering starts with the results obtained from as good a model as one can afford, but then builds prototypes and tests them, and refines the design based on ongoing observations, rather than betting a billion-dollar investment on the prediction of a model program with no intermediate testing.
The only place I know of where the latter is done is in the engineering of nuclear weapons, post the test-ban treaty. At that point, the US (and USSR, and maybe a few other players) felt that they had enough data to be able to do computational simulations of novel nuclear devices well enough to be able to build them without testing them, and to prevent others who might get the data from being able to do so as well, they made it a federal crime to export a supercomputer capable of doing the computations. Until, of course, ordinary desktop computers got fast enough to do the computations (oops, pesky Moore’s Law:-), first in beowulf clusters and then in single chassis machines. My cell phone could probably do the computations at this point — my kid’s PS3 playstation would have been classified as a munition twenty years ago. But even there, if anyone comes up with a design outside of the range represented by the data they took and the corresponding theory, they run the risk of discovering the hard way that their design, however carefully simulated, is incorrect.
The Bikini test was one example of what happens then. You build what you think is a 5 MT nuclear device and it turns out to be 15 MT and almost kills your observers (and doses all sorts of people with radiation that you didn’t intend to dose). Oops.
rgb
RGB:
I will continue your warning (about clocks, accuracy, and models) with the following “real world” clock problems and “acceptance by recognized Royal Authority and testing” …
Your “real” pendulum must take into account the friction of the bearings holding that pendulum, the metal-on-metal sliding of the clock “whiskers” and “locks” as they alternately stop and release the motion of the gears, the air friction of the pendulum, etc.
What I see happening is that the GCM models do “try” – but they “try” by attacking minutiae – such as, in your example, the changing air friction of the pendulum due to changing air density (air density being assumed proportional to air temperature), but not the change due to air pressure as a cold front comes through, or the change from humid summer air to dry, cold winter air that alters not the air density but the wood length and weight of the cabinet and the alignment of the clock’s foundation! Or taking into account the air density and brass thermal expansion, but ignoring the degradation of grease over time, while accounting for laboratory temperature (but assigning it an assumed constant room temperature at a constant elevation above sea level, and thus constant atmospheric pressure!)
And yet the average of all the GCMs (Global Clock Models) is claimed to be more accurate, even though every Global Clock Model uses a different fudge factor for net clock friction and clock environment.
In our real world, the Royal Astronomer charged by decree with approving the first chronometer for the Royal Navy had a competing – VERY LUCRATIVE! – “theory” of using lunar observations (with its lucrative hundreds of Naval Observatory clerks compiling lunar observation books and tables for the Royal Navy, plus the Royal Navy’s many-thousand-pound award itself).
In the real world, this “Royal Astronomer” took apart the first working chronometer, had it re-assembled badly by untrained, uncaring mechanics rather than by the owner/inventor/builder himself, placed the chronometer in ever-changing sunshine for months as he mis-used it, mis-wound it and abused it. He tested that chronometer under false conditions and false assumptions, and refused all efforts at an unbiased test with unbiased observers under real-world conditions: actual sea voyages, with the instrument used by real-world but trained navigators. That is, “real” navigators DO take care of their instruments and DO take adequate precautions to avoid deliberate errors and malpractice. On the other hand, they DO make actual errors, but they are responsible in noting those errors and making corrections, since their lives are actually at stake. (An impartial observer will see the relationship between – for example – UAH satellite data-taking, data-tracking and temperature-processing corrections being made in the open, with errors rapidly acknowledged and corrected, and NASA-GISS/East Anglia/Mann’s data being deliberately hidden and manipulated!)
The first working chronometer was not really the elaborate first or second model he built (both based on large, heavy clockwork assemblies in large boxes and heavy frames) but rather the very, very tiny, very, very lightweight “watch” that we know today. Very small means EVERY factor resisting accuracy is made small enough – by fabricating every part with extreme care and precision – that each little factor can be “neglected” in practice. Thus, for example, “grease” and oil aren’t needed, because the parts are light enough and clean enough that the bearings do not need to be lubricated by either. Light weight and very small parts mean that friction between restraints and locking parts is also minimized.
Much like the rocket+fuel+structure problem in reverse (a bigger rocket to carry more payload needs a bigger heavier structure to carry more fuel that needs a bigger tank that requires a heavier structural weight that requires more fuel that requires a bigger tank that ….) a smaller, lighter “watch” meant a more accurate chronometer!
Getting that chronometer accepted by the Royal Navy so it could be used also meant that “Royal Authority” and “consensus science” had to be not only ignored, but publicly attacked with publicity and information and accurate knowledge!
rgbatduke says:
November 25, 2013 at 9:21 am [ … ]
Which explains why engineers are every bit as necessary to our current understanding of the world as physicists and scientists. Maybe even more so.
Empirical knowledge is absolutely necessary in ‘climate studies’. Notice that the alarmist crowd relies primarily on computer models and peer reviewed papers — while skeptics pay attention to what the real world is telling us. They are often not the same thing at all.
The Climate Con is the biggest scam I’ve seen in all my 6 decades. How long do they think they can carry on with this crime? For that is what it truly is.
“….those aren’t the sins of the GCMs themselves, those are the sins of those who wrote the AR5 SPM, and that without question is highly dishonest.”
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
You’re an expert and very fair/generous in your explanation and characterization. It’s notable that the GCMs are all overestimating warming, as far as I can tell. All of them. What’s the probability? What would an Old School pit-boss in Vegas do?
The probability question is backwards. All of the models are overestimating the warming, yes. However, predicting the climate isn’t deterministic (as Nick points out above). A perfectly correct model could predict more warming than has been observed, and many of the models do produce as little warming as has been observed in some runs, when they are run many, many times with small random perturbations of their initial conditions. The question is, for a given model, how likely is it that the model is correct and yet we see as little warming as has been observed.
For many of the models, that probability — called the p-value, the probability of getting the data given the null hypothesis “this model is a perfect model of climate” — is very low. In one variation of ordinary hypothesis testing, this motivates rejecting the null hypothesis, that is, concluding that the model is not a perfect model (since one has no choice about the data end of things).
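As a sketch of that test, the snippet below stands in for a model’s perturbed-initial-condition ensemble with a normal distribution of 20-year trends; the observed trend, ensemble mean and spread are all invented numbers, and a real test would use actual ensemble runs rather than this stand-in.

```python
# A minimal sketch of the hypothesis test described above, with made-up numbers.
import random

random.seed(0)
observed_trend = 0.05                     # hypothetical observed trend, C/decade
ensemble = [random.gauss(0.25, 0.08)      # hypothetical ensemble of model runs:
            for _ in range(10_000)]       # mean 0.25, spread 0.08 C/decade (assumed)

# One-sided p-value: how often does the "perfect model" produce a trend at least
# as low as the one actually observed?
p_value = sum(t <= observed_trend for t in ensemble) / len(ensemble)
print(f"p-value under 'this model is a perfect model of climate': {p_value:.4f}")
# A very small p-value is grounds, in classical hypothesis testing, to reject the
# null hypothesis that this particular model is a perfect model of the climate.
```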
However, statistics per se does not permit you to reject all of the models just because some of the models fail a hypothesis test. On the other hand, common sense tells you that if 30+ models all written with substantial overlap in their underlying physics and solution methodology provide you with 30+ chances to get decent overlap and hence a not completely terrible p-value, you have a good chance of getting a decent result even from a failed model. The possibility of data dredging rears its ugly head, and one has to strengthen the criterion for rejection (reject more aggressively) to account for multiple chances. You also might find grounds to reject the models by examining quantities other than GASTA. A failed model might get e.g. rainfall patterns completely wrong and lower troposphere temperatures completely wrong (wrong enough to soundly reject the null hypothesis “this is a perfectly correct model” for EACH quantity, making the collective p-value even lower) but still (by chance) not get a low enough p-value to warrant rejection considering GASTA alone.
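The “multiple chances” arithmetic is easy to make concrete. In the sketch below, q is an assumed, purely illustrative probability that a single wrong model luckily overlaps the observations well enough to escape rejection; with 30 such models, the chance that at least one survives by luck alone becomes very large.

```python
# A sketch of the "multiple chances" point: 30+ wrong models give 30+ chances
# for at least one of them to escape rejection by luck. q is an assumed value.
n_models = 30
q = 0.10   # assumed chance that a single WRONG model luckily escapes rejection

p_at_least_one_survivor = 1 - (1 - q) ** n_models
expected_survivors = n_models * q

print(f"P(at least one wrong model escapes rejection): {p_at_least_one_survivor:.2f}")
print(f"Expected number of lucky survivors out of {n_models}: {expected_survivors:.1f}")
# This is why the rejection criterion has to be tightened when many related models
# are compared to the same data, and why a model that also fails on rainfall or
# lower-troposphere temperature accumulates even stronger evidence against it.
```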
Finally, there is the “Joe the cab driver” correction — which is what you are proposing. This is a term drawn from Nassim Nicholas Taleb’s book The Black Swan, where he has Joe the cab driver and a Ph.D. statistician analyze the (paraphrased) question: “given that 30 flips of a two-sided coin have turned up heads, what is the probability that the next flip is heads?” The (frequentist) statistician gives the book answer for Bernoulli trials — each flip is independent, it’s a two-sided coin, so it should come up heads 0.5 of the time independent of its past history of flips, and after ENOUGH flips it will EVENTUALLY balance out.
Joe, however, is an intuitive Bayesian. He doesn’t enter his analysis with overwhelming prior belief in unbiased two-sided coins, so even though he completely understands the Ph.D.’s argument, he uses posterior probability to correct his original prior belief that the coin was unbiased and replies: “It’s a mug’s game. The coin is fixed. You can flip it forever and it will usually come out heads.” Joe, you see, has much sad experience of people who fix supposedly “random” events like coin flips, poker games, dice games, and horse races so that they become mug’s games. In Joe’s experience, the probability of 30 flips coming up heads is 1 in 2^30, call it one in a billion. The probability of encountering a human who wants to play a mug’s game with you is much, much higher — especially if you are approached by a stranger on the street who proposes the game to you (who comes up to you and says things like “given that 30 flips…”?) — maybe as high as 1 in 100. So to Joe, it’s 99.9999% likely that the coin is fixed (you’ve encountered a person who cheats), compared to encountering an unbiased coin just at the end of a 30-head sequence in a long Bernoulli trial.
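Joe’s arithmetic is a one-line application of Bayes’ theorem. The sketch below uses his rough prior of 1 in 100 for having met a cheat and treats a fixed coin as always coming up heads; both numbers are guesses, which is exactly the point.

```python
# A back-of-the-envelope version of Joe's reasoning. The prior (1 in 100) and the
# likelihood for a rigged coin are rough guesses, not measured quantities.
prior_rigged = 1 / 100              # rough prior: chance this is a mug's game
prior_fair = 1 - prior_rigged

p_30_heads_given_rigged = 1.0       # a fixed coin essentially always comes up heads
p_30_heads_given_fair = 0.5 ** 30   # about 1 in a billion

# Bayes' theorem: P(rigged | 30 heads)
posterior_rigged = (p_30_heads_given_rigged * prior_rigged) / (
    p_30_heads_given_rigged * prior_rigged + p_30_heads_given_fair * prior_fair
)
print(f"P(30 heads | fair coin)      = {p_30_heads_given_fair:.2e}")
print(f"P(coin is rigged | 30 heads) = {posterior_rigged:.7f}")
# With these guesses the posterior is ~0.9999999: Joe's "it's a mug's game".
```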
Is climate currently a mug’s game? Not necessarily. Here you have to balance three things:
a) The climate models are all generally right, but the climate happens to be following a consistent but comparatively unlikely trajectory at the moment, on a run of La Nina-like neutral-to-cooling Bernoulli-trial events (as Nick has proposed).
b) It’s a mug’s game, and the GCMs are written to deliberately overestimate warming so that the nefarious Ph.D.s all keep their funding, so that Al Gore gets rich on climate futures, so that the UN maintains cash flow in a new kind of “tax” to transfer money from the richer nations to the poorer ones (taking graft at all points along the way, why not, no limit to the cynicism you can throw into a mug’s game).
c) The climate models are honestly written by competent Ph.D.s who are trying to do their best and really do believe that their models are correct and the climate is accidentally not cooperating a la a) above, but those scientists are simply wrong, and the best-of-faith models are not correct.
These are not even mutually exclusive alternative hypotheses. Some models could be correct and honestly written, and the climate could be running cooler than it “should” in the sense of somehow averaging over many nearly identically started Earths, and it could also be a mug’s game with some of the models written to deliberately exaggerate warming, and the IPCC could well be exploiting those models (and deliberately including them) to line many pockets with carbon taxes even though they are nowhere near as certain of CAGW as they claim, and some of the climate models could be written by honest enough scientists but just happen to do the physics badly — not pursue the computations to a fine enough spatial granularity, for example, or omit a mechanism like the proposed GCR-albedo link that turns out to be crucial, all at the same time.
Lots of models, after all, and many people involved in the part that could be a kind of mug’s game, and not all of them are actual code authors or principal investigators in charge of developing and applying a given GCM.
Accusing people of criminal malfeasance — bad faith, deliberate manipulation of presented information to accomplish an evil purpose (or even a “good” purpose, but based on deliberate lies and misrepresentation of the truth) — is serious business. It is something that both sides in this debate do far too freely. Warmists accuse deniers of being stupid and/or in the pay and pockets of the Evil Empire of Energy, where to them “renewable” energy and “green” energy somehow stands for energy that belongs “to the people” instead of being developed, implemented, sold and delivered by exactly the same people they are accusing the deniers of being funded by. Deniers accuse warmists of deliberately cooking up elaborate computer programs designed to show warming for the sole purpose of separating fools from money and at the same time accomplishing some sort of green-communist revolution, removing the means of energy production from the rightful hands of the capitalists (instead of noting that all of the alternative energy sources are being developed, implemented, sold and delivered by exactly the same capitalists they might think are being shut out). Everybody accuses everybody else of bad faith, lies, coercion, bribery, unelected social revolution, exaggeration, extremism.
This is sadly typical of religious and political disputes, but it has no place in science. That doesn’t mean that scientists need to sit by mute while a mug’s game is played out, but it does mean that scientists need to be very cautious about impugning the motives or honesty of the proponents of a point of view and focus on the objective issues, such as whether or not the point of view is well or poorly supported by the data and our general understanding of science and statistics.
At the moment, one thing that is pretty clear to me is that the current GASTA trajectory is not good empirical support for the correctness of most of the GCMs in CMIP5. One probably cannot reject them all on the basis of a failed hypothesis test, but one probably should reject some of them, at least unless and until the climate takes a jump back up to where they are not in egregious disagreement by as much as 0.6 C over a mere 20-year baseline. I will even go out on a limb and state clearly that, in my opinion, removing the words from the AR5 SPM that more or less implied that, and redrawing the figure in such a way as to conceal that and convey the idea that the mean behavior of many GCMs is somehow a valid statistical predictor of future climate, was either rank incompetence and misuse of statistics or probable evidence that there is a mug’s game being played here. The latter is absolutely indefensible on the basis of the theory of statistics (and using such things as a “mean of many GCMs” or the “standard deviation of many GCMs” for any given quantity to compute confidence intervals of a supposed future climate is literally beyond words).
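As a concrete illustration of that last parenthetical point, here is the ensemble mean-and-spread arithmetic itself, with invented projections. The computation is trivial; the objection is that structurally different models are not random draws from any defined population, so nothing about this arithmetic turns the spread into a confidence interval for the real climate.

```python
# A minimal illustration of the statistical objection: average hypothetical
# projections from structurally different models and quote "mean +/- 2*std" as
# if it were a confidence interval. All numbers are invented for illustration.
import statistics

projections = [1.5, 2.1, 2.8, 3.3, 3.6]   # hypothetical warming at 600 ppm, deg C

mean = statistics.mean(projections)
std = statistics.stdev(projections)
print(f"ensemble 'mean':            {mean:.2f} C")
print(f"ensemble 'mean +/- 2*std': [{mean - 2*std:.2f}, {mean + 2*std:.2f}] C")
# Nothing in this calculation says how likely the real climate is to land inside
# that interval -- that is the sense in which the procedure is statistically empty.
```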
But that doesn’t mean that even one of the GCMs was itself written in bad faith. It just means that some of them are very probably wrong, and that concealing that fact from policy makers after acknowledging it in an earlier draft is, yes, quite dishonest.
rgb
“But that doesn’t mean that even one of the GCMs was itself written in bad faith.”
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Thanks very much for your thoughtful, expert explanation.