Guest Post by Willis Eschenbach
[UPDATE]: I have added a discussion of the size of the model error at the end of this post.
Over at Judith Curry’s climate blog, the NASA climate scientist Dr. Andrew Lacis has been providing some comments. He was asked:
Please provide 5- 10 recent ‘proof points’ which you would draw to our attention as demonstrations that your sophisticated climate models are actually modelling the Earth’s climate accurately.
To this he replied (emphasis mine),
Of note is the paper by Hansen, J., A. Lacis, R. Ruedy, and Mki. Sato, 1992: Potential climate impact of Mount Pinatubo eruption. Geophys. Res. Lett., 19, 215-218, which is downloadable from the GISS webpage.
It contains their model’s prediction of the response to Pinatubo’s eruption, a prediction done only a few months after the eruption occurred in June of 1991:
Figure 1. Predictions by NASA GISS scientists of the effect of Mt. Pinatubo on global temperatures. Scenario “B” was Hansen’s “business as usual” scenario. “El” is the estimated effect of a volcano the size of El Chichón. “2*El” is a volcano twice the size of El Chichón. The modelers assumed the volcano would be 1.7 times the size of El Chichón. Photo is of Pinatubo before the eruption.
Excellent, sez’ I, we have an actual testable prediction from the GISS model. And it should be a good one if the model is good, because they weren’t just guessing about inputs. They were using early estimates of aerosol depth that were based on post-eruption observations. But with GISS, you never know …
Here’s Lacis again talking about how the real-world outcome validated the model results. (Does anyone else find this an odd first choice when asked for evidence that climate models work? It is a 20-year-old study by Lacis himself. Is this the best evidence he has?) But I digress … Lacis says further about the matter:
There we make an actual global climate prediction (global cooling by about 0.5 C 12-18 months following the June 1991 Pinatubo volcanic eruption, followed by a return to the normal rate of global warming after about three years), based on climate model calculations using preliminary estimates of the volcanic aerosol optical depth. These predictions were all confirmed by subsequent measurements of global temperature changes, including the warming of the stratosphere by a couple of degrees due to the volcanic aerosol.
As always, the first step in this procedure is to digitize their data. I use commercial digitizing software called “GraphClick” on my Mac; there are equivalent programs for the PC. It is boring, tedious hand work. I have made the digitized data available here as an Excel worksheet.
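For what it’s worth, the heart of what GraphClick and similar digitizers do is a linear map from pixel coordinates to data coordinates, calibrated from two known points per axis. A minimal sketch in Python; the pixel coordinates below are invented purely for illustration:

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value,
    given two calibration points on one axis (assumes a linear axis)."""
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda px: value_a + (px - pixel_a) * scale

# Hypothetical calibration: x-axis pixels 50 and 450 correspond to 1991 and 1996
to_year = make_axis_map(50, 1991.0, 450, 1996.0)
# Hypothetical y-axis: pixels 300 and 100 correspond to anomalies of 0.0 and 0.5 C
to_temp = make_axis_map(300, 0.0, 100, 0.5)

print(to_year(250), to_temp(200))  # 1993.5 0.25
```

Each point clicked on the graph is run through both maps; the tedium is in the clicking, not the math.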
Being the untrusting fellow that I am, I graphed up the actual temperatures for that time from the GISS website. Figure 2 shows that result, along with the annual averages of their Pinatubo prediction (shown in detail below in Figure 3), at the same scale that they used.
Figure 2. Comparison of annual predictions with annual observations. Upper panel is Figure 2(b) from the GISS prediction paper, lower is my emulation from digitized data. Note that prior to 1977 the modern version of the GISS temperature data diverges from the 1992 version of the temperature data. I have used an anomaly of 1990 = 0.35 for the modern GISS data in order to agree with the old GISS version at the start of the prediction period. All other data is as in the original GISS prediction. Pinatubo prediction (blue line) is an annual average of their Figure 3 monthly results.
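The offset adjustment described in the caption, shifting the modern GISS series by a constant so that it agrees with the old 1992-era version at the start of the prediction period, is simple to sketch. The anomaly values below are invented for illustration, not the actual GISS numbers:

```python
def align_at(series, ref_year, target_value):
    """Shift an anomaly series (dict of year -> anomaly) by a constant
    so that series[ref_year] equals target_value."""
    offset = target_value - series[ref_year]
    return {yr: t + offset for yr, t in series.items()}

# Hypothetical modern anomalies (on a different baseline than the 1992 paper)
modern = {1989: 0.28, 1990: 0.43, 1991: 0.40}
aligned = align_at(modern, 1990, 0.35)  # match the old series' 1990 value of 0.35
```

Since anomalies are only defined relative to a baseline, adding a constant changes nothing about the shape of the record; it just puts the two versions on a common footing.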
Again from their paper:
Figure 2 shows the effect of E1 and 2*El aerosols on simulated global mean temperature. Aerosol cooling is too small to prevent 1991 from being one of the warmest years this century, because of the small initial forcing and the thermal inertia of the climate system. However, dramatic cooling occurs by 1992, about 0.5°C in the 2*El case. The latter cooling is about 3 σ [sigma], where σ is the interannual standard deviation of observed global annual-mean temperature. This contrasts with the 1-1/2 σ coolings computed for the Agung (1963) and El Chichon (1982) volcanos.
So their model predicted a large event, a “three-sigma” cooling from Pinatubo.
But despite their prediction, it didn’t turn out like that at all. Look at the red line above showing the actual temperature change. If you didn’t know there was a volcano in 1991, that part of the temperature record wouldn’t even catch your eye. Pinatubo did not cause anywhere near the maximum temperature swing predicted by the GISS model. It was not a three-sigma event, just another day in the planetary life.
The paper also gave the monthly predicted reaction to the eruption. Figure 3 shows detailed results, month by month, for their estimate and the observations.
Figure 3. GISS observational temperature dataset, along with model predictions both with and without Pinatubo eruptions. Upper panel is from GISS model paper, lower is my emulation. Scenario B does not contain Pinatubo. Scenario P1 started a bit earlier than P2, to see if the random fluctuations of the model affected the result (it didn’t). Averages are 17-month Gaussian averages. Observational (GISS) temperatures are adjusted so that the 1990 temperature average is equal to the 1990 Scenario B average (pre-eruption conditions). Photo Source
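For readers who want to reproduce the smoothing: a 17-month Gaussian average is just a Gaussian-weighted running mean over a 17-point window. The paper does not state the kernel’s standard deviation, so the sigma below is my assumption:

```python
import numpy as np

def gaussian_smooth(x, width=17, sigma=17 / 6):
    """Gaussian-weighted running mean over a centered `width`-point window.
    The sigma (here width/6, so the window spans about +/-3 sigma) is an
    assumption; the GISS paper says only '17-month Gaussian average'."""
    half = width // 2
    kernel = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma) ** 2)
    kernel /= kernel.sum()
    # mode="same" keeps the series length; the ends are effectively zero-padded
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="same")
```

The zero-padding means the first and last eight or so months are pulled toward zero; for a careful comparison one would mirror-pad or truncate the ends instead.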
One possibility for the model prediction being so far off would be if Pinatubo didn’t turn out to be as strong as the modelers expected. Their paper was based on very early information, three months after the event, viz:
The P experiments have the same time dependence of global optical depth as the E1 and 2*El experiments, but with τ 1.7 times larger than in E1 and the aerosol geographical distribution modified as described below. These changes crudely account for information on Pinatubo provided at an interagency meeting in Washington D.C. on September 11 organized by Lou Walter and Miriam Baltuck of NASA, including aerosol optical depths estimated by Larry Stowe from satellite imagery.
However, their estimates seem to have been quite accurate. The aerosols continued unabated at high levels for months, and the average optical depth over the first ten months after the eruption was indeed 1.7 times that seen after El Chichón. I find this (paywalled) reference:
Dutton, E. G., and J. R. Christy, Solar radiative forcing at selected locations and evidence for global lower tropospheric cooling following the eruptions of El Chichon and Pinatubo, Geophys. Res. Lett., 19, 2313-2316, 1992.
As a result of the eruption of Mt. Pinatubo (June 1991), direct solar radiation was observed to decrease by as much as 25-30% at four remote locations widely distributed in latitude. The average total aerosol optical depth for the first 10 months after the Pinatubo eruption at those sites is 1.7 times greater than that observed following the 1982 eruption of El Chichon
and from a 1995 U.S. Geological Survey study:
The Atmospheric Impact of the 1991 Mount Pinatubo Eruption ABSTRACT
The 1991 eruption of Pinatubo produced about 5 cubic kilometers of dacitic magma and may be the second largest volcanic eruption of the century. Eruption columns reached 40 kilometers in altitude and emplaced a giant umbrella cloud in the middle to lower stratosphere that injected about 17 megatons of SO2, slightly more than twice the amount yielded by the 1982 eruption of El Chichón, Mexico. The SO2 formed sulfate aerosols that produced the largest perturbation to the stratospheric aerosol layer since the eruption of Krakatau in 1883. … The large aerosol cloud caused dramatic decreases in the amount of net radiation reaching the Earth’s surface, producing a climate forcing that was two times stronger than the aerosols of El Chichón.
So the modelers were working off of accurate information when they made their predictions. Pinatubo was just as strong as they expected, perhaps stronger.
Finally, after all of that, we come to the bottom line, the real question. What was the difference between the total effect of the volcano in the model and in reality? What overall difference did it make to the temperature?
Looking at Fig. 3, we can see that the model results and the data differ in more than just the maximum temperature drop. In the model, the temperature dropped earlier than was observed. It also dropped faster than actually occurred. Finally, it stayed below normal for longer in the model than in reality.
To measure the combined effect of these differences, we use the sum of the temperature variations, from before the eruption until the temperature returned to pre-eruption levels. It gives us the total effect of the eruption, in “degree-months”. One degree-month is the result of changing the global temperature one degree for one month. It is the same as lowering the temperature half a degree for two months, and so on.
It is a measure of how much the volcano changed the temperature. It is shown in Fig. 3 as the area enclosed by the horizontal colored lines and their respective average temperature data (heavier lines of the same color). These lines mark the departure from and return to pre-eruption conditions. The area enclosed by each of them is measured in degree-months (degrees vertically times months horizontally).
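The degree-month bookkeeping can be made concrete in a few lines of code. This is my own sketch of the calculation just described, run on an invented toy series rather than the actual GISS data:

```python
def degree_months(monthly_anomaly, baseline):
    """Total cooling in degree-months: the sum of (baseline - T) over the
    months from the first drop below baseline until the first recovery."""
    total, months, started = 0.0, 0, False
    for t in monthly_anomaly:
        below = t < baseline
        if below:
            started = True
        if started and not below:
            break  # temperature has returned to pre-eruption levels
        if started:
            total += baseline - t
            months += 1
    return total, months

# Toy series: a 0.5-degree dip lasting four months below a 0.2 baseline
series = [0.2, 0.2, -0.3, -0.3, -0.3, -0.3, 0.2, 0.2]
print(degree_months(series, 0.2))  # (2.0, 4)
```

Applied to the real monthly data, the same accounting gives the eight degree-months observed versus sixteen predicted, discussed below.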
The observations showed that Pinatubo caused a total decrease in the global average temperature of eight degree-months. This occurred over a period of 46 months, until temperatures returned to pre-eruption levels.
The model, however, predicted twice that, sixteen degree-months of cooling. And in the model, temperatures did not return to pre-eruption conditions for 63 months. So that’s the bottom line at the end of the story — the model predicted twice the actual total cooling, and predicted the recovery would take about forty percent longer (63 months versus 46) than it actually did … bad model, no cookies.
Now, there may be an explanation for that poor performance that I’m not seeing. If so, I invite Dr. Lacis or anyone else to point it out to me. Absent any explanation to the contrary, I would say that if this is his evidence for the accuracy of the models, it is an absolute … that it is a perfect … well, upon further reflection let me just say that I think the study and prediction is absolutely perfect evidence regarding the accuracy of the models, and I thank Dr. Lacis for bringing it to my attention.
[UPDATE] A number of the commenters have said that the Pinatubo prediction wasn’t all that wrong and that the model didn’t miss the mark by all that much. Here’s why that is not correct.
Hansen predicted what is called a “three sigma” event. He got about a two sigma event (2.07 sigma). “Sigma” measures how far an event lies from the average, in standard deviations, and the rarity of an event grows very non-linearly with sigma.
A two sigma event is pretty common. It occurs about one time in twenty. So in a dataset the size of GISSTEMP (130 years) we would expect to find somewhere around 130/20 = six or seven two sigma interannual temperature changes. These are the biggest of the inter-annual temperature swings. And in fact, there are eight two-sigma temperature swings in the GISSTEMP data.
A three sigma event, on the other hand, is much, much rarer. It is a one in a thousand event. The biggest inter-annual change in the record is 2.7 sigma. There’s not a single three sigma year in the entire dataset. Nor would we expect one in a 130 year record.
So Hansen was not predicting something ordinary. He was predicting a temperature drop never before seen, a once-in-a-thousand-year drop.
Why is this important? Remember that Lacis is advancing this result as a reason to believe in climate models.
Now, suppose someone went around saying his climate model was predicting a “thousand-year flood”, the huge kind of millennial flood never before seen in people’s lifetimes. Suppose further that people believed him, and spent lots of money building huge levees to protect their homes and cities and jacking up their houses above predicted flood levels.
And finally, suppose the flood turned out to be the usual kind, the floods that we get every 20 years or so.
After that, do you think the flood guy should go around citing that prediction as evidence that his model can be trusted?
But heck, this is climate science …
Willis —
I don’t get what the problem is. Seems to me that what you want the model to do is know precisely what compounds were emitted and in what volumes. From there as per George E.P. Box (all models are wrong but some are useful) the model shows that it’s useful. Did it get the eruption perfect? No. Would I expect it to? No. Dr Lacis says the signs were right which I take as being on the right track.
Stevo and others —
I really sorta doubt that NASA sat around and pronounced the model “good enough” and left it alone; no doubt they set about doing tweaks and hacks and whatnot to make it better than it was. From what I can tell from what I read, models are always in a state of development. This should be no different.
Willis —
I’m all for this sort of examination, but I’m hesitant to pronounce a model as useless. In fact I’d like to see about 3x more money spent on these things (as per the Judith Curry site threads), properly validated etc., so that we can determine whether or not man is having an adverse effect as claimed. The effort to understand climate is going to require a lot of modeling effort one way or another. This examination, if it really is pointing out weakness as claimed, is helpful to the overall effort. The question is whether or not this has already been superseded by NASA and/or other shops. Any idea?
Johnston, Jason Scott, and Robert G. Fuller, Jr. 2010. Global Warming Advocacy Science: a Cross Examination. Research Paper. University Of Pennsylvania: University Of Pennsylvania Law School, May.
http://www.probeinternational.org/UPennCross.pdf
Global Circulation Model Parameters conflict, addressing tinkering with the parameters (esp. aerosols) to force the models to simulate 20th century “observations”. Since the tinkering is unique to each model, there is no way to tell which “climate sensitivity parameter” is most accurate.
Mount Pinatubo released 20 million metric tons of SO2 in 1991. That amount is less than China’s annual release of SO2 in every one of the past ten years.
China currently has an aggressive desulphurization program. I wonder what Dr. Lacis’s prediction is of the warming impact should the program prove successful.
Apparently the EPA wishes to assure Global Warming will continue for the indefinite future (together with their jobs and power). On the one hand, they wish to reduce CO2 emissions to forestall AGW. On the other hand, they wish to reduce SO2 emissions which reduce global warming.
EPA. 2009. SO2 Reductions and Allowance Trading under the Acid Rain Program. Governmental. Clean Air Markets. April 14. http://www.epa.gov/airmarkets/progsregs/arp/s02.html
The link above provides an overview of how reductions in SO2 emissions are to be achieved under the Acid Rain Program. Below is a report of the EPA doubling down on Command-And-Control in 2010.
Houston wrote: “the aerosol ‘knob’ needs to be turned down a bit.”
Yes, and the “aerosol knob” is mostly the climate sensitivity knob. They multiply any warming or cooling effect by water vapor feedbacks. Evidently a lot less of that multiplying happens than they assume.
Willis, thanks for another great post. Statistics is largely a mystery to me, but is it possible to adjust other parameters (such as CO2 sensitivity) in Lacis’ model, to see if doing so produces a better fit?
randomengineer says:
December 29, 2010 at 10:48 pm
Again, my point is not that this model is very poor. It is that Dr. Lacis is touting the results of this prediction as evidence that we should believe in the models.
But the GISS model predicted a three sigma cooling from Pinatubo; that’s literally a once-in-a-millennium temperature drop. Three sigma is p of roughly .001, after all. They predicted that we would cool, in two short years, all the way back to the 1958 temperature. That is a pretty clear and unequivocal prediction.
But temperatures didn’t even drop as much as in ’73-’76, when there were no volcanoes. They predicted a massive cooling and it was a non-event. Truly, look at the red line in Fig. 2. There was nothing out of the ordinary. How wrong does a prediction have to be for you folks to get it??? It was another in the unending list of disaster predictions, only this time it was an actual verifiable prediction, and it was an actual verifiable bust. No once in a thousand years cooling. Didn’t happen.
Perhaps that is some kind of acceptable error for you. For me, when Dr. Lacis says that those results are a reason to believe in models, I find them a reason to laugh.
w.
Alec Rawls says:
December 29, 2010 at 11:32 pm
Houston wrote: “the aerosol ‘knob’ needs to be turned down a bit.”
……..
When discussing models, parametrisation and ‘knob turning’ with my students, one of them commented, “a bit like an Etch A Sketch”. How true.
Willis: please quote the uncertainty on the number that the model predicted, and the uncertainty on the number you determined from the observations.
randomengineer says:
December 29, 2010 at 10:48 pm
Willis –
I don’t get what the problem is. Seems to me that what you want the model to do is know precisely what compounds were emitted and in what volumes. From there as per George E.P. Box (all models are wrong but some are useful) the model shows that it’s useful. Did it get the eruption perfect? No. Would I expect it to? No. Dr Lacis says the signs were right which I take as being on the right track.
——————–
Sure would like to know where you work random… I want to avoid products and services from your company at all costs!
If I gave a prediction or forecast to my company that was off at its peak by 100% they’d show me the door.
Here’s the crux: if the model was off 100% when modeling a fairly well understood (and relatively rare) climatic event, how far off will it be in modeling far more complicated and nuanced events? Two hundred percent? Three hundred percent? More? Less?
Here’s a simple test of the veracity of your trust in the model: run the model today and have it predict the average global temperature for 2013. Will you bet a year’s pay that the model will be correct?
KD, PhD Engineer and former modeler
Stevo says:
December 30, 2010 at 4:39 am
Willis: please quote the uncertainty on the number that the model predicted, and the uncertainty on the number you determined from the observations.
———–
I think you just reinforced Willis’ point.
If the model and observation errors mean the model was NOT statistically different from the observation, then the whole of the AGW worry is within the noise.
If the model and observation ARE statistically different, then the model is, well, crap.
In either case, before the world makes trillion dollar policies I think we need a whole lot more confidence in our understanding of the climate and man’s impact (or lack of impact) on it.
I agree – “Prediction is hard, especially of the future.”
Much easier to predict the past.
[grin]
Clearly what’s needed here is a trick to hide the lack of decline.
JohnWho says:
December 30, 2010 at 8:01 am
“I agree – ‘Prediction is hard, especially of the future.’
Much easier to predict the past.”
Well, unfortunately, climate models don’t even predict the past. Surely if the models were any good, they would be able to recalculate the missing climate data that the East Anglia CRU couldn’t find space to store, so they trashed it.
Well, since they completely ignore the Nyquist sampling theorem in their data gathering, there is no way they can even reconstruct the past; they can’t even correctly recover the averages.
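The commenter’s Nyquist point can be illustrated in a few lines: a signal sampled at less than twice its frequency produces samples identical to those of a much lower “alias” frequency, so the original can never be recovered from the samples. A toy demonstration with invented numbers:

```python
import numpy as np

fs = 10.0                       # sampling rate (samples per unit time)
n = np.arange(20)               # sample indices
true_hz = 7.0                   # above the Nyquist limit of fs / 2 = 5
alias_hz = true_hz - fs         # -3: the frequency the samples "look like"

a = np.sin(2 * np.pi * true_hz * n / fs)
b = np.sin(2 * np.pi * alias_hz * n / fs)
print(np.allclose(a, b))        # True: the two signals are indistinguishable
```

Whether monthly station sampling actually violates Nyquist for the climate signals of interest is the commenter’s claim, not something this sketch settles; it only shows what aliasing means.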
(Dr) KD — Here’s the crux: if the model was off 100% when modeling a fairly well understood (and relatively rare) climatic event, how far off will it be in modeling far more complicated and nuanced events? Two hundred percent? Three hundred percent? More? Less?
Couple of points:
1. I’m not starting with a presumption that Dr Lacis input exact values of the eruption volume, just that there was an eruption. If he/they input exact counts then obviously the model ought to have performed better. How much better? I don’t know. The question is whether or not the eruption causes it to be off 5 years later.
2. If the point of the model is to be able to precisely replicate eruptions, there’s an obvious problem. It seems that the point of this model is to have some sense of how eruptions impact global climate drivers, in which case short term eruption recovery precision may not be a big factor for longer term usability.
I don’t know how you got a PhD assuming that all models are exact for all possible inputs. Heaven knows that we in R&D engineering can sometimes be wildly wrong even when performing mundane tasks like estimating development time, and depending on what you’re building, more often than not. As per Dilbert “it’s logically impossible to schedule the unknown.” Plenty of models that are good at one factor may not be at others. So? The point is whether the model is useful for the factor set it was designed for. Has anyone asked whether or not the point of the model in question was to accurately replicate all possible short term eruptions?
willis, OT.
Thought about the issue of bias in the control run.
simple: control run would then be subtracted from every projection for bias removal
have a look at this
http://sms.cam.ac.uk/media/872375?format=flv&quality=high&fetch_type=stream
Baa Humbug says:
December 29, 2010 at 8:06 am
Willis
I’m afraid your invitation, though it has not fallen on deaf ears, will never be accepted, because, as they say in the horse training classics:
A horse never forgets, but he forgives.
A donkey never forgets and never forgives.
You are not dealing with horses 😉
Well, at least not the front end of horses.
🙂
Steven Mosher says:
December 30, 2010 at 9:33 am
Thanks, Mosh. If you take a look at Fig. 3 you’ll see that during the time in question, the control run varied little from zero. As a result, there is little difference between what I show, and the (projection minus control run) that you are talking about.
All the best,
w.
randomengineer says:
December 30, 2010 at 9:27 am
They assumed that the eruption would put 1.7 times as much ejecta into the stratosphere as El Chichón did. In the event it was about that, perhaps nearer to twice the ejecta. So they had good numbers, although a bit smaller than the actual eruption.
While that may be your question, it is not mine. Nor did Dr. Lacis refer to that in any form. He said that the 1992 model predictions of the response to Pinatubo show that we can trust the models. Not what the model does five years after the eruption. He pointed us to the predictions of the size and length of the climate effects of the eruption. So no … that’s not the question at all.
You are still not getting my point, randomengineer. At this stage of the game, the point of this particular model’s performance is simply to convince us that we should trust the models. That’s what Dr. Lacis said, that the performance of this model on Pinatubo is an exemplar. Not just another model run, but a model run that will show unbelievers that models really can be trusted.
Egads, randomengineer, I don’t know how you got your job if your reading skills are that poor. [Said for effect only, to emphasize the aggro nature of your remark].
KD noted that the model was off by 100%, and (since you had previously said a 100% error was somehow acceptable) he asked how far off a model would have to be for you to call it unacceptable. 200%? 300?
And in response, you accuse him of being undeserving of his PhD because you fantasize that he is claiming that “all models are exact for all possible inputs”?!? As far as I can tell, he never said that, nor anything remotely resembling that. He asked you a question, and rather than answer it, you attack his credentials … you sure that’s the answer you want to go with there?
I begin to despair. We’re not talking about “R&D models”. We’re talking about models that they claim are so advanced that we should spend billions of dollars based on their 100-year forecasts. What planet are you from, where people are willing to do that based on an “R&D model” with a 100% error in a fairly simple forecast?
No, no, and no. The point is not whether the model is useful for what it was designed for, that’s your R&D background speaking.
The issue is whether the model is suitable for what it is being used for. That’s a huge difference. Because right now it is being used to try to convince us that model results can be trusted.
Dr. Lacis has advanced the Pinatubo example in an attempt to establish that the model is suitable for use as a 100-year forecast machine to support billion dollar decisions.
Do you think that the Pinatubo results establish that?
w.
Willis Eschenbach says:
December 30, 2010 at 12:38 pm
randomengineer says:
December 30, 2010 at 9:27 am
—————————————
Thanks, you saved me a lot of typing with your (excellent) response to random.
Random, it may surprise you to know that I have been in corporate R&D as well as product engineering and software engineering. I hold seven patents. My work is in products you can buy today, specifically in very high tech medical imaging devices.
I can assure you that the accuracy requirements, in the world where the companies I’ve worked for contemplate investing millions of dollars, are far stricter than a 100% error. Again I ask, where do you work?
Great post Willis, looks like Lacis shot himself in the foot, picking a 20-year-old model which failed to get even close to predicting the effect of the eruption on climate as his only example of the success of climate models. If that’s the best they’ve got, it’s a travesty!
However, due to the deterministic chaos inherent in our climate system I would have been very surprised if they had nailed this event. Averaging climate using an arbitrary period is simply a mental construct which has no meaning in our dynamic climate system. Climate behaves like a complex driven pendulum where average behaviour and trends mean nothing – much like our financial markets.
Willis Eschenbach says:
But, as you know, the control runs do not correctly predict the time and size of El Nino and La Nina events, which are extremely sensitive to initial conditions…i.e., essentially chaotic. Given that there was a quite strong El Nino event during that time and that we know what the general effect of such strong El Nino events are on global temperatures, the value that you have obtained for the response to the eruption is almost surely an underestimate…quite possibly a fairly significant underestimate.
Not really. The reason Mt. Pinatubo is chosen is not that there were hundreds of clean tests to choose from and this one happened to turn out better than the rest, but rather that such clean tests, where the predictions are clearly made beforehand (so that no arguments about tuning, however misguided, can be made), are rare. It is an unfortunate fact that we don’t have an ensemble of earths to experiment with.
Well, if you are going to demand very high precision, then you don’t need to do any tests. After all, different models have climate sensitivities varying by over a factor of 2, so you could just say right away, “I demand more accuracy before I am willing to do anything.” Of course, others might have a different point-of-view and say “Since the full range of model sensitivities points to the need to get off of fossil fuels, certainly before we use up anything near all the stores of coal that we have (let alone the more exotic sources like tar sands), we think we should do something now.” Large financial decisions are always made with uncertainties and using models that are imperfect. Do you think the economic modeling done to predict the impact of various tax and spending policies are better?
Essentially, only if one assigns infinite weight to the desire to burn fossil fuels unfettered and as cheaply as possible will one come to the conclusion that one should do nothing to lower our carbon emissions as long as there is any uncertainties about the effects (which means, of course, forever).
Interesting post. If I read the paper right, they assumed the volcanic effect immediately covered 0 to 30 N latitude bands at the time of eruption. My view is that this “ambitious” spread rate may have led to their over-prediction.
Well, there is something that might have thrown off their calculations. From 1989 to 1992, anthropogenic SO2 emissions dropped by around 15 million metric tons, somewhat negating the Mt. Pinatubo SO2 injection. This happened after a 15-year plateau in emissions going back to the early ’70s.
Joel Shore says:
December 30, 2010 at 3:18 pm
Large financial decisions are always made with uncertainties and using models that are imperfect. Do you think the economic modeling done to predict the impact of various tax and spending policies are better?
————————————————————-
OK, please name a financial decision as large as is contemplated by the AGW crowd that was made based on a model projection 100 years into the future using a model that had a 100% error when modeling an event over a five year period.
Any example will do.