Guest Post by Willis Eschenbach
[UPDATE]: I have added a discussion of the size of the model error at the end of this post.
Over at Judith Curry’s climate blog, the NASA climate scientist Dr. Andrew Lacis has been providing some comments. He was asked:
Please provide 5-10 recent ‘proof points’ which you would draw to our attention as demonstrations that your sophisticated climate models are actually modelling the Earth’s climate accurately.
To this he replied (emphasis mine),
Of note is the paper by Hansen, J., A. Lacis, R. Ruedy, and Mki. Sato, 1992: Potential climate impact of Mount Pinatubo eruption. Geophys. Res. Lett., 19, 215-218, which is downloadable from the GISS webpage.
It contains their model’s prediction of the response to Pinatubo’s eruption, a prediction done only a few months after the eruption occurred in June of 1991:
Figure 1. Predictions by NASA GISS scientists of the effect of Mt. Pinatubo on global temperatures. Scenario “B” was Hansen’s “business as usual” scenario. “El” is the estimated effect of a volcano the size of El Chichón. “2*El” is a volcano twice the size of Chichón. The modelers assumed the volcano would be 1.7 times the size of El Chichón. Photo is of Pinatubo before the eruption.
Excellent, sez’ I, we have an actual testable prediction from the GISS model. And it should be a good one if the model is good, because they weren’t just guessing about inputs. They were using early estimates of aerosol depth that were based on post-eruption observations. But with GISS, you never know …
Here’s Lacis again talking about how the real-world outcome validated the model results. (Does anyone else find this an odd first choice when asked for evidence that climate models work? It is a 20-year-old study by Lacis. Is this the best evidence he has?) But I digress … Lacis says further about the matter:
There we make an actual global climate prediction (global cooling by about 0.5 C 12-18 months following the June 1991 Pinatubo volcanic eruption, followed by a return to the normal rate of global warming after about three years), based on climate model calculations using preliminary estimates of the volcanic aerosol optical depth. These predictions were all confirmed by subsequent measurements of global temperature changes, including the warming of the stratosphere by a couple of degrees due to the volcanic aerosol.
As always, the first step in this procedure is to digitize their data. I use a commercial digitizing program called “GraphClick” on my Mac; there are equivalent programs for the PC. It’s boring, tedious hand work. I have made the digitized data available here as an Excel worksheet.
Being the untrusting fellow that I am, I graphed up the actual temperatures for that time from the GISS website. Figure 2 shows that result, along with the annual averages of their Pinatubo prediction (shown in detail below in Figure 3), at the same scale that they used.
Figure 2. Comparison of annual predictions with annual observations. Upper panel is Figure 2(b) from the GISS prediction paper, lower is my emulation from digitized data. Note that prior to 1977 the modern version of the GISS temperature data diverges from the 1992 version of the temperature data. I have used an anomaly of 1990 = 0.35 for the modern GISS data in order to agree with the old GISS version at the start of the prediction period. All other data is as in the original GISS prediction. Pinatubo prediction (blue line) is an annual average of their Figure 3 monthly results.
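For anyone wanting to repeat the two adjustments described in the caption, here is a minimal sketch. It assumes the digitized data has already been loaded into plain Python lists and dictionaries. The function names and the sample numbers are hypothetical illustrations, not the actual GISS values or the worksheet layout.

```python
def annual_means(monthly_values, first_year):
    """Average consecutive groups of 12 monthly values.

    A trailing partial year is ignored.
    """
    n_years = len(monthly_values) // 12
    return {
        first_year + i: sum(monthly_values[12 * i:12 * i + 12]) / 12
        for i in range(n_years)
    }


def rebaseline(anomalies_by_year, reference_year, target_value):
    """Shift every anomaly by one constant so reference_year equals target_value."""
    offset = target_value - anomalies_by_year[reference_year]
    return {year: value + offset for year, value in anomalies_by_year.items()}


# Hypothetical anomalies in degrees C (NOT the real GISS numbers).
modern = {1989: 0.20, 1990: 0.39, 1991: 0.35, 1992: 0.13}

# Pin 1990 to 0.35 so the modern series lines up with the 1992-era series.
aligned = rebaseline(modern, reference_year=1990, target_value=0.35)
```

A constant shift like this changes only the zero point. The year-to-year changes, which are what the comparison depends on, are left untouched.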
Again from their paper:
Figure 2 shows the effect of El and 2*El aerosols on simulated global mean temperature. Aerosol cooling is too small to prevent 1991 from being one of the warmest years this century, because of the small initial forcing and the thermal inertia of the climate system. However, dramatic cooling occurs by 1992, about 0.5°C in the 2*El case. The latter cooling is about 3 σ [sigma], where σ is the interannual standard deviation of observed global annual-mean temperature. This contrasts with the 1-1/2 σ coolings computed for the Agung (1963) and El Chichon (1982) volcanos.
So their model predicted a large event, a “three-sigma” cooling from Pinatubo.
But despite their prediction, it didn’t turn out like that at all. Look at the red line above showing the actual temperature change. If you didn’t know there was a volcano in 1991, that part of the temperature record wouldn’t even catch your eye. Pinatubo did not cause anywhere near the maximum temperature swing predicted by the GISS model. It was not a three-sigma event, just another day in the planetary life.
The paper also gave the monthly predicted reaction to the eruption. Figure 3 shows detailed results, month by month, for their estimate and the observations.
Figure 3. GISS observational temperature dataset, along with model predictions both with and without Pinatubo eruptions. Upper panel is from GISS model paper, lower is my emulation. Scenario B does not contain Pinatubo. Scenario P1 started a bit earlier than P2, to see if the random fluctuations of the model affected the result (it didn’t). Averages are 17-month Gaussian averages. Observational (GISS) temperatures are adjusted so that the 1990 temperature average is equal to the 1990 Scenario B average (pre-eruption conditions). Photo Source
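The caption mentions 17-month Gaussian averages. The post does not give the exact filter, so what follows is only a sketch of one common way to do it: a centered, Gaussian-weighted moving average over 17 months. The width choice (window spanning roughly plus or minus three standard deviations) and the renormalization of weights near the ends of the series are my assumptions, not necessarily what was used for the figure.

```python
import math


def gaussian_smooth(values, window=17, sigma=None):
    """Centered Gaussian-weighted moving average.

    Near the ends of the series, only the weights that fall inside
    the data are used, and they are renormalized to sum to one.
    """
    half = window // 2
    if sigma is None:
        sigma = window / 6.0  # assumption: window covers about +/- 3 sigma
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]

    smoothed = []
    for i in range(len(values)):
        total = 0.0
        weight_sum = 0.0
        for k in range(-half, half + 1):
            j = i + k
            if 0 <= j < len(values):
                w = weights[k + half]
                total += w * values[j]
                weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed


# Hypothetical monthly anomalies: flat, with a single one-month spike.
monthly = [0.0] * 20 + [1.0] + [0.0] * 20
smooth = gaussian_smooth(monthly)
```

A filter like this spreads a one-month spike over the neighboring months while preserving its total area, which matters when the quantity of interest is an area under the curve.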
One possibility for the model prediction being so far off would be if Pinatubo didn’t turn out to be as strong as the modelers expected. Their paper was based on very early information, three months after the event, viz:
The P experiments have the same time dependence of global optical depth as the El and 2*El experiments, but with τ 1.7 times larger than in El and the aerosol geographical distribution modified as described below. These changes crudely account for information on Pinatubo provided at an interagency meeting in Washington D.C. on September 11 organized by Lou Walter and Miriam Baltuck of NASA, including aerosol optical depths estimated by Larry Stowe from satellite imagery.
However, their estimates seem to have been quite accurate. The aerosols continued unabated at high levels for months. Over the first ten months after the eruption, the optical depth averaged 1.7 times that observed after El Chichón. I found this (paywalled):
Dutton, E. G., and J. R. Christy, Solar radiative forcing at selected locations and evidence for global lower tropospheric cooling following the eruptions of El Chichon and Pinatubo, Geophys. Res. Lett., 19, 2313-2316, 1992.
As a result of the eruption of Mt. Pinatubo (June 1991), direct solar radiation was observed to decrease by as much as 25-30% at four remote locations widely distributed in latitude. The average total aerosol optical depth for the first 10 months after the Pinatubo eruption at those sites is 1.7 times greater than that observed following the 1982 eruption of El Chichon
and from a 1995 US Geological Survey study:
The Atmospheric Impact of the 1991 Mount Pinatubo Eruption ABSTRACT
The 1991 eruption of Pinatubo produced about 5 cubic kilometers of dacitic magma and may be the second largest volcanic eruption of the century. Eruption columns reached 40 kilometers in altitude and emplaced a giant umbrella cloud in the middle to lower stratosphere that injected about 17 megatons of SO2, slightly more than twice the amount yielded by the 1982 eruption of El Chichón, Mexico. The SO2 formed sulfate aerosols that produced the largest perturbation to the stratospheric aerosol layer since the eruption of Krakatau in 1883. … The large aerosol cloud caused dramatic decreases in the amount of net radiation reaching the Earth’s surface, producing a climate forcing that was two times stronger than the aerosols of El Chichón.
So the modelers were working off of accurate information when they made their predictions. Pinatubo was just as strong as they expected, perhaps stronger.
Finally, after all of that, we come to the bottom line, the real question. What was the difference in the total effect of the volcano, both in the model and in reality? What overall difference did it make to the temperature?
Looking at Fig. 3 we can see that there is a difference in more than just maximum temperature drop between model results and data. In the model results, the temperature dropped earlier than was observed. It also dropped faster than actually occurred. Finally, the temperature stayed below normal for longer in the model than in reality.
To measure the combined effect of these differences, we use the sum of the temperature variations, from before the eruption until the temperature returned to pre-eruption levels. It gives us the total effect of the eruption, in “degree-months”. One degree-month is the result of changing the global temperature one degree for one month. It is the same as lowering the temperature half a degree for two months, and so on.
It is a measure of how much the volcano changed the temperature. It is shown in Fig. 3 as the area enclosed by the horizontal colored lines and their respective average temperature data (heavier same color lines). These lines mark the departure from and return to pre-eruption conditions. The area enclosed by each of them is measured in “degree – months” (degrees vertically times months horizontally).
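For concreteness, the degree-month bookkeeping can be sketched in a few lines. This is a minimal illustration, assuming a list of (smoothed) monthly temperatures and a single pre-eruption baseline value. The function name and the sample series are hypothetical, not the digitized data.

```python
def degree_months(monthly_temps, baseline):
    """Sum the departures below the baseline, from the first month the
    temperature drops below it until the first month it returns.

    Returns (total_degree_months, duration_in_months). The total is
    negative for cooling.
    """
    total = 0.0
    duration = 0
    started = False
    for temp in monthly_temps:
        departure = temp - baseline
        if not started:
            if departure < 0:
                started = True
            else:
                continue  # still at or above pre-eruption conditions
        if departure >= 0:
            break  # temperature has returned to the baseline
        total += departure
        duration += 1
    return total, duration


# Half a degree of cooling for two months ...
print(degree_months([0.0, -0.5, -0.5, 0.1], baseline=0.0))
# ... is the same total as one degree of cooling for one month.
print(degree_months([0.0, -1.0, 0.2], baseline=0.0))
```

Both toy series give a total of one degree-month of cooling, lasting two months in the first case and one month in the second.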
The observations showed that Pinatubo caused a total decrease in the global average temperature of eight degree-months. This occurred over a period of 46 months, until temperatures returned to pre-eruption levels.
The model, however, predicted twice that, sixteen degree-months of cooling. And in the model, temperatures did not return to pre-eruption conditions for 63 months. So that’s the bottom line at the end of the story: the model predicted twice the actual total cooling, and predicted it would take more than a third longer (63 months versus 46) to recover than actually happened … bad model, no cookies.
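The arithmetic behind that comparison is simple enough to check directly. The four input numbers below are the ones quoted above; the variable names are mine.

```python
# Totals and durations as quoted in the post.
observed_degree_months = 8
model_degree_months = 16
observed_recovery_months = 46
model_recovery_months = 63

# How much larger the modeled total cooling is than the observed one.
cooling_ratio = model_degree_months / observed_degree_months

# How much longer the modeled recovery takes than the observed one.
recovery_ratio = model_recovery_months / observed_recovery_months
extra_recovery_percent = (recovery_ratio - 1) * 100

print(f"model / observed total cooling: {cooling_ratio:.2f}")
print(f"model / observed recovery time: {recovery_ratio:.2f}")
print(f"extra recovery time in model:   {extra_recovery_percent:.0f}%")
```

The total cooling ratio comes out at 2, and 63 months against 46 works out to a recovery roughly 37 percent longer in the model.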
Now, there may be an explanation for that poor performance that I’m not seeing. If so, I invite Dr. Lacis or anyone else to point it out to me. Absent any explanation to the contrary, I would say that if this is his evidence for the accuracy of the models, it is an absolute … that it is a perfect … well, upon further reflection let me just say that I think the study and prediction is absolutely perfect evidence regarding the accuracy of the models, and I thank Dr. Lacis for bringing it to my attention.
[UPDATE] A number of the commenters have said that the Pinatubo prediction wasn’t all that wrong and that the model didn’t miss the mark by all that much. Here’s why that is not correct.
Hansen predicted what is called a “three sigma” event. He got about a two sigma event (2.07 sigma). “Sigma” is a measure of how common it is for something to occur. However, it is far from linear.
A two sigma event is pretty common. It occurs about one time in twenty. So in a dataset the size of GISTEMP (130 years) we would expect to find somewhere around 130/20 = six or seven two sigma interannual temperature changes. These are the biggest of the inter-annual temperature swings. And in fact, there are eight two-sigma temperature swings in the GISTEMP data.
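A three sigma event, on the other hand, is much, much rarer. It is on the order of a one in a thousand event (for a normal distribution, about 1 in 370 counting swings in both directions, or about 1 in 740 for cooling alone). The biggest inter-annual change in the record is 2.7 sigma. There’s not a single three sigma year in the entire dataset. Nor would we expect one in a 130 year record.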
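To see just how non-linear sigma is, here is a rough check of the tail probabilities. It assumes the interannual changes follow a normal distribution, which is an idealization of mine, and it uses only the standard library. The helper names are hypothetical.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal Z: a swing of k sigma in either direction."""
    return math.erfc(k / math.sqrt(2))


def one_sided_tail(k):
    """P(Z < -k) for a standard normal Z: a drop of at least k sigma."""
    return 0.5 * math.erfc(k / math.sqrt(2))


record_years = 130  # approximate length of the temperature record

for k in (2.0, 3.0):
    p = two_sided_tail(k)
    print(
        f"{k:.0f} sigma: about 1 in {1 / p:.0f}, "
        f"expected count in {record_years} years = {record_years * p:.1f}"
    )
```

Under that assumption, a two sigma swing in either direction is about a 1 in 22 event, so around six would be expected in 130 years. A three sigma swing is about 1 in 370 (about 1 in 740 for a drop alone), so fewer than one would be expected in a record of that length.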
So Hansen was not just making a prediction of something usual. He was making a prediction that we would see a temperature drop never before seen, a once in a thousand year drop.
Why is this important? Remember that Lacis is advancing this result as a reason to believe in climate models.
Now, suppose someone went around saying his climate model was predicting a “thousand-year flood”, the huge kind of millennial flood never before seen in people’s lifetimes. Suppose further that people believed him, and spent lots of money building huge levees to protect their homes and cities and jacking up their houses above predicted flood levels.
And finally, suppose the flood turned out to be the usual kind, the floods that we get every 20 years or so.
After that, do you think the flood guy should go around citing that prediction as evidence that his model can be trusted?
But heck, this is climate science …



That was a great examination! Thanks to WUWT I know a bit more about this perplexing issue. I spend more and more time on this website. A donation is in order!
When I was validating thermal-hydraulics computer models against data, a result as bad as that would have meant back to the drawing-board.
Few computer simulations ever match reality exactly. Glibly dismissing a model as “bad” because it did not predict the observations exactly is foolish. At no point have you discussed what kind of accuracy you think a model should have, for you to consider it “good”. If you want to assess a model realistically, you need to understand such factors as the uncertainty in the observations, the underlying assumptions, and any simplifications that have been made, among other things.
What could possibly account for this error where the forcing is correct but the model output is wrong? Oh ya, if the models have the climate sensitivity wrong.
Perhaps warmists are so excited by the Pinatubo modeled vs actual results because it’s the closest they ever came to reality.
Trying to model the impacts of volcanic eruptions on temperature is futile because there’s no way of segregating volcanic impacts from the “noise” created by El Niños and La Niñas and other short-term influences. There are at least ten short-term cooling episodes in the 20th century temperature record that look just like Pinatubo but which are unrelated to volcanic eruptions. And not all volcanic eruptions caused temperatures to decrease. One of the largest (El Chichón 1982) was in fact followed by a temperature increase.
Volcanic eruptions receive so much attention because the temperature record shows cooling after 1940, and climate models can’t hindcast this cooling without a significant volcanic contribution (although they still do a pretty poor job of it). In other words, the models simulate just one short-term forcing – the one that fits the theory – and ignore all the rest.
Dear Willis
I’m afraid your invitation, though not fallen on deaf ears, will not be accepted.
I help horse owners (mostly ladies) deal with their “scared” horses (as a hobby).
Horses are both very intelligent and notoriously big chickens. As soon as the owner shows up at the paddock with a lead rope and halter, the horse bolts to the other end of the paddock. A game of frustrating cat n mouse ensues.
I teach these people how to avoid all that.
You my friend, often turn up at the paddock, not just with a rope n halter in one hand, but with a saddle over one shoulder (announcing that you intend to ride him) and a whip cracking away in the other hand, announcing the ride will be hard and painful.
The rest of us love the way you crack that whip. It’s a veritable work of art. And you explain how you crack that whip enabling us to learn, rather like the demonstrations at agricultural shows.
But I’m afraid your invite will never be accepted because as they say in the horse training classics..
You are not dealing with horses 😉
The truly amazing thing to me is that there are so few attempts to test the models. Because of the Pinatubo paper it can be assumed agreed that the way to test a model is to input real data from the past, run the model, and finally compare the run with what was actually observed. Vast amounts of time and money have been spent, massive distortions imposed on the economies (of some countries) because of the CAGW theory. But the tests of the theory that are available have not been performed. Sure, it might take a lot of work to perfect these tests and they will no doubt be subject to criticism like all of science. But that they have not been tried …. it staggers the imagination.
The great philosopher Karl Popper demarcated science by the capacity to admit falsification. That massive insight seems to have been lost, and perhaps more recent philosophy of science is culpable there. Possibly now that CAGW is in retreat the requisite testing will start to be done?
Willis says
“Here’s Lacis again talking about how the real-world outcome validated the model results. (Does anyone else find this an odd first choice when asked for evidence that climate models work? It is a 20-year-old study by Lacis. Is this the best evidence he has?)”.
This surely is the key question. If the predictive ability is so poor, over such a short timescale and with such well defined starting parameters, one can only speculate about 50 to 100 years predictive ability.
That is most interesting. When astrophysicists are asked about the relationship between solar activity and weather, they often point to 1816, the year without a summer. A year of extraordinary solar quiescence. Warmists counter that solar activity had little to do with it, that it was the eruption of Tambora the year before. Given the shallow atmospheric response as well as the very quick recovery evidenced by the Pinatubo temperatures above, I suspect both are right.
http://en.wikipedia.org/wiki/Year_Without_a_Summer
Marlene Anderson says:
December 29, 2010 at 7:53 am
Perhaps warmists are so excited by the Pinatubo modeled vs actual results because it’s the closest they ever came to reality.
Bingo! At least they got the sign right.
@Houston, we have a problem…
I completely agree. This is/was a good opportunity to improve their parametrization. However, the bind that Lacis et al are in is that they have claimed certainty, and now cannot improve their models, or data analysis methods, for fear of admitting that they were wrong.
Their only recourse at this point has been to ramp up the cut-and-paste publications, and hope that policy changes are enacted, such that they can claim that disaster was averted. (cf. the CFC fiasco). The situation may well still play out in this direction – the UN, politicians, and bankers are powerful allies. But whatever is going on in climatology – science left long ago.
“It is obvious that the GISS model was too sensitive to the effect of aerosols, and the aerosol “knob” needs to be turned down a bit.”
If Houston is correct then this puts the “aerosols were the problem” for the cooling of 1940-1970’s at risk as an explanation.
Interesting post Willis,
Atmospheric mixing, the goofy notion of Global temperature, and assumptions about the SO2 extent are factors that come to mind related to the inaccuracy of the model.
The eruption occurred on June 15 from 1:45 – 10:45 pm from the Manila area dumping ash and aerosols in a South Eastern direction towards Singapore in the South to Bangkok in the North. At the time of the eruption Tropical Storm Yunya was passing 75 km (47 miles) to the northeast of Mount Pinatubo, causing a large amount of rainfall in the region.
It is said to have injected millions of tons of SO2 into the Troposphere which created a “global” Sulfuric Acid haze in the upper atmosphere, but the question seems to be: to what extent?
Are the estimates wrong because most of the SO2 ended up as acid rain in the Indian Ocean and because most of the “global” temperature readings are from the Northern Hemisphere?
Global impacts take time and, as I understand it, the atmospheric mixing moves from the Northern Hemisphere to the Southern. The greatest temperature impact would have been in the Southern Hemisphere?
Good post.
Another brick in the wall.
Either the models are way too sensitive or maybe it was the wrong type of ash.
Stevo (7:46 AM),
I think that the point is that the model identified by a key proponent as being a vindication of the rigor of the model predictions overestimates the sensitivity by a factor of two and the duration of the effect by a substantial amount. The net result is to underscore the uncertainties of aerosol forcings used in the models. So, the “best” example of model validation is apparently not rigorous at all.
It would not be surprising if the same sensitivity errors exist for water aerosols (clouds), which is a matter of current debate.
It appears that world economies are being put under severe burdens based on theoretical modeling that doesn’t predict well at all.
“If you want to assess a model realistically, you need to understand such factors as the uncertainty in the observations, the underlying assumptions, and any simplifications that have been made, among other things.”
You should tell Hansen, et al exactly the same thing, as these are the very reasons that models and computer simulations on climate don’t work. There is just too much uncertainty and lack of understanding of the interrelated processes that drive climate, not to mention the bad faith adjustments and temperature smoothing practiced by the global warming/climate chaos/climate change gang of junk scientists.
Really, I’ve lost faith in science in the public interest as it has been mostly taken over by a gang of leftist/liberal activists intent on controlling and redistributing the wealth of the Western World, mostly in an attempt to destroy the very institutions that built the wealth they enjoy the fruits of. These people hate the West, hate themselves (if only they were born a minority), and wish to destroy all vestiges of Western Civilization. One only has to look at the groups supporting the green initiatives and global warming nonsense to see the truth, although there are a few well-meaning but useful idiots who support them as well.
TO: Houston… and Stevo– personally I agree with your characterization of modelling. I think all honest skeptics would. The point of this sort of derision towards flawed model output– and if Willis is correct the output was flawed– is that the model output is the source of the “proof” of catastrophic AGW according to the alarmists. The alarmists make extraordinary claims of AGW and the need for the technocrats to take over the pricing and use of the world’s energy sources. Hence they must be held to a standard of extraordinary proof. In response the alarmists deliver models such as the GISS model. If the model does indeed fail this simple test, what’s left of the alarmists’ case? That’s the point. All models are continuously refined until they WORK. If the output never improves, they are discarded– at least in the REAL world. My personal skeptic case is that the global climate may be either too complex or too chaotically random to ever properly model. The GISS model has done nothing to rebut that skepticism, much less justify turning over energy production to the likes of the UN, Al Gore and Jim Hansen.
Another interesting factor to consider, if you roll the “way-back machine” to 1991, we appear to have been moving into an El Nino which would have compensated for a significant portion of the cooling.
But the AGW community will glibly pass off “scarier-than-real” results from models like this as ‘reality’, which we foolish folks are supposed to believe.
Time for a “reality check”.
Cracking good first question to Lacis (he said, not allowing false modesty to overwhelm him)
And it’s worth noting that rather than the 5 or 10 ‘proof points’ that he was offered the chance to discuss from the whole palette of climate modelling, he came up with just the one that Willis discusses and one other, also by himself. And made no better a job of it either.
It is quite remarkable how little the individual Climatologists know of each other’s work, while feeling able to dismiss any ‘outsiders’ criticisms as ignorant or unqualified.
Taking a wider look at GCMs and volcanoes two things stand out:
1. Models underestimate the variation in temperature.
2. There are often large drops in observed temperature which are not related to volcanoes; in the case of models the only drops in temperature are volcano related.
http://www.climatedata.info/Forcing/Forcing/volcanoes.html
OK, here’s this high school educated, general contractor, layperson’s short summary of what this indicates.
The climate models are hyper sensitive and greatly exaggerate the climate’s reaction to atmospheric injection of material.
So I make the conclusion that the models exaggerated the reaction to volcanic injection from Pinatubo just as they do the reaction from CO2 emissions.
Therefore climate models are not “robust” and anyone claiming they are needs to face ethics charges or become a contractor.
Am I missing anything?
“Stevo says:
December 29, 2010 at 7:46 am
Few computer simulations ever match reality exactly. Glibly dismissing a model as “bad” because it did not predict the observations exactly is foolish. At no point have you discussed what kind of accuracy you think a model should have, for you to consider it “good”. If you want to assess a model realistically, you need to understand such factors as the uncertainty in the observations, the underlying assumptions, and any simplifications that have been made, among other things.”
yup.
“So, rather than a opportunity for patting themselves on the back for “getting the direction right,” they should have taken the opportunity to improve their model.”
yup.
If one looks at the SO2 as merely a shield to incoming solar, then the earlier drop in the model could be due to the modelling of the dispersion of the aerosol after the event.
I would expect their model of the release of SO2 from the volcano to not have much fidelity. The depth of the cooling would indicate that the negative forcing is off and the quick rebound could be due to a poor model of the residency time of the aerosol.
Just the other day I was watching a great video of a statistician who was talking about emulating GCM results. Which two of the 32 parameters were the MOST sensitive to perturbation?
1. whether the slab ocean was on or coupled ocean was on.
2. whether the sulfur cycle was off or on.
The model Hansen used in this period had a sulfur cycle that is now 20 years old, and I believe it used a slab ocean. It would be neat to see how a current GCM with a fully coupled ocean and a much better sulfur cycle would do.
For some interesting reading on the sulfur cycle in models see this
http://www.google.com/url?sa=t&source=web&cd=2&ved=0CCMQFjAB&url=http%3A%2F%2Facdb-ext.gsfc.nasa.gov%2FPeople%2FChin%2Fchin.jgr.2000a.pdf&ei=JGYbTaHBGY_EsAOKv9DYCg&usg=AFQjCNEg4zW55YP-G_5bN5Fsa5m7ELY6_Q&sig2=lEFbnkSqnhOz3ULuo-tN3Q
some comparisons between various models and a bit of insight into the complexity of getting it exactly right.
That said, one wants to know how well sun spots or magnetic fields or the thunderstorm thermostat do in a similar test. As models of the climate they are silent on the effect of aerosols.