Model Charged with Excessive Use of Forcing

Guest Post by Willis Eschenbach

The GISS Model E is the workhorse of NASA’s climate models. I got interested in the GISSE hindcasts of the 20th century due to an interesting posting by Lucia over at the Blackboard. She built a simple model (which she calls “Lumpy”) that does a pretty good job of emulating the GISS model results, using only the forcings and a time lag. Stephen Mosher points out how to access the NASA data here (with a good discussion), so I went to the NASA site he indicated and got the GISSE results he points to. I plotted them against the GISS version of the global surface air temperature record in Figure 1.

Figure 1. GISSE Global Circulation Model (GCM or “global climate model”) hindcast 1880-2003, and GISS Global Temperature (GISSTemp) Data. Photo shows the new NASA 15,000-processor “Discover” supercomputer. Top speed is 160 trillion floating point operations per second (a unit known by the lovely name of “teraflops”). What it does in a day would take my desktop computer seventeen years.

Now, that all looks impressive. The model hindcast temperatures are a reasonable match to the observed temperatures, both by eyeball and mathematically (R^2 = 0.60). True, it misses the early 20th century warming (1920-1940) entirely, but overall it’s a pretty close fit. And the supercomputer does 160 teraflops. So what could go wrong?

To try to understand the GISSE model, I got the forcings used for the GISSE simulation and compared the total forcing to the GISSE model results. The forcings are yearly averages, so I compared them to the yearly results of the GISSE model. Figure 2 shows a comparison of the GISSE model hindcast temperatures and a linear regression of those temperatures on the total forcing.

Figure 2. A comparison of the GISSE annual model results with a linear regression of those results on the total forcing. (A “linear regression” estimates the best fit of the forcings to the model results). Total forcing is the sum of all forcings used by the GISSE model, including volcanos, solar, GHGs, aerosols, and the like. Deep drops in the forcings (and in the model results) are the result of stratospheric aerosols from volcanic eruptions.
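
For anyone who wants to check this against the spreadsheet linked at the end of the post, here is a minimal sketch of that regression in Python (the function and variable names are mine; I make no claim that this is how GISS, or anyone else, does it):

```python
import numpy as np

def regress_on_forcing(total_forcing, model_temp):
    """Ordinary least squares fit: model_temp ~ a * total_forcing + b.

    total_forcing and model_temp are yearly series (e.g. 1880-2003)
    taken from the forcing and model-output data discussed above.
    Returns the slope a (degrees C per W/m2), the intercept b, and R^2.
    """
    total_forcing = np.asarray(total_forcing, dtype=float)
    model_temp = np.asarray(model_temp, dtype=float)
    X = np.column_stack([total_forcing, np.ones_like(total_forcing)])
    coef, *_ = np.linalg.lstsq(X, model_temp, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((model_temp - fitted) ** 2)
    ss_tot = np.sum((model_temp - model_temp.mean()) ** 2)
    return coef[0], coef[1], 1.0 - ss_res / ss_tot
```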

Now to my untutored eye, Fig. 2 has all the hallmarks of a linear model that is missing a constant trend of unknown origin. (The hallmarks are the obvious similarity in shape combined with differing trends and a low R^2.) To see if that was the case, I redid my analysis, this time including a constant trend. As is my custom, I simply included the year of each observation in the analysis to capture that trend. That gave me Figure 3.

Figure 3. A comparison of the GISSE annual model results with a regression of those results on the total forcing, including a constant annual trend. Note the very large increase in R^2 compared to Fig. 2, and the near-perfect match of the two datasets.
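
The only change needed for this second fit is an extra regressor for the year, so the regression can absorb a constant annual trend. Again, a sketch under my own assumptions about the data layout:

```python
import numpy as np

def regress_on_forcing_plus_trend(years, total_forcing, model_temp):
    """OLS fit: model_temp ~ a * total_forcing + c * year + b.

    The 'c * year' term is the constant annual trend described in the text.
    Returns the coefficient array [a, c, b] and R^2; 'a' is the sensitivity
    in degrees C per W/m2 and 'c' is the built-in trend in degrees C per year.
    """
    years = np.asarray(years, dtype=float)
    total_forcing = np.asarray(total_forcing, dtype=float)
    model_temp = np.asarray(model_temp, dtype=float)
    X = np.column_stack([total_forcing, years, np.ones_like(years)])
    coef, *_ = np.linalg.lstsq(X, model_temp, rcond=None)
    fitted = X @ coef
    r2 = 1.0 - np.sum((model_temp - fitted) ** 2) / np.sum((model_temp - model_temp.mean()) ** 2)
    return coef, r2
```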

There are several surprising things in Figure 3, and I’m not sure I see all of the implications of those things yet. The first surprise was how close the model results are to a bozo simple linear response to the forcings plus the passage of time (R^2 = 0.91, average error less than a tenth of a degree). Foolish me, I had the idea that somehow the models were producing some kind of more sophisticated, complex, lagged, non-linear response to the forcings than that.

This almost completely linear response of the GISSE model makes it trivially easy to create IPCC style “scenarios” of the next hundred years of the climate. We just use our magic GISSE formula, that future temperature change is equal to 0.13 times the forcing change plus a quarter of a degree per century, and we can forecast the temperature change corresponding to any combination of projected future forcings …
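
In code, the whole “scenario generator” fits in a few lines. The 0.13 and 0.25 figures are the coefficients from my regression above, not numbers published by GISS:

```python
def giss_emulator(delta_forcing_wm2, years_elapsed,
                  sensitivity=0.13, trend_per_century=0.25):
    """Toy linear emulator of the GISSE hindcast described above:
    temperature change = sensitivity * forcing change + built-in trend."""
    return sensitivity * delta_forcing_wm2 + trend_per_century * years_elapsed / 100.0

# Example: a scenario with 3 W/m2 of additional forcing over the next century
# gives 0.13 * 3 + 0.25 = 0.64 degrees C of warming in this toy emulator.
print(giss_emulator(3.0, 100))
```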

Second, this analysis strongly suggests that in the absence of any change in forcing, the GISSE model still warms. This is in agreement with the results of the control runs of the GISSE and other models that I discussed at the end of my post here. The GISSE control runs also showed warming when there was no change in forcing. This is a most unsettling result, particularly since other models showed similar (and in some cases larger) warming in the control runs.

Third, the climate sensitivity shown by the analysis is only 0.13°C per W/m2 (0.5°C per doubling of CO2). This is far below the official NASA estimate of the response of the GISSE model to the forcings. They put the climate sensitivity from the GISSE model at about 0.7°C per W/m2 (2.7°C per doubling of CO2). I do not know why their official number is so different.

I thought the difference in calculated sensitivities might be because they have not taken account of the underlying warming trend of the model itself. However, when the analysis is done leaving out the warming trend of the model (Fig. 2), I get a sensitivity of 0.34°C per W/m2 (1.3°C per doubling). So that doesn’t solve the puzzle either. Unless I’ve made a foolish mathematical mistake (always a possibility for anyone, check my work), the sensitivity calculated from the GISSE results is half a degree of warming per doubling of CO2 …
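
For reference, the per-doubling numbers above come from multiplying the per-W/m2 sensitivity by the conventional forcing for a doubling of CO2, roughly 5.35 × ln(2) ≈ 3.7 W/m2. A quick sketch of that conversion (my assumption about the conversion factor; the official GISS figure may use a slightly different doubling forcing):

```python
import math

F_2XCO2 = 5.35 * math.log(2.0)   # ~3.7 W/m2 per doubling of CO2 (conventional value)

def per_doubling(sensitivity_per_wm2):
    """Convert a sensitivity in degrees C per W/m2 to degrees C per CO2 doubling."""
    return sensitivity_per_wm2 * F_2XCO2

print(per_doubling(0.13))  # ~0.5 C per doubling (the full-fit result above)
print(per_doubling(0.34))  # ~1.3 C per doubling (the no-trend fit, Fig. 2)
print(per_doubling(0.70))  # ~2.6 C per doubling (close to the official ~2.7 C figure)
```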

Troubled by that analysis, I looked further. The forcing is close to the model results, but not exact. Since I was using the sum of the forcings, obviously some forcings make more difference in their model than others. So I decided to remove the volcano forcing, to get a better idea of what else was in the forcing mix. The volcanos are the only forcing that makes such large changes on a short timescale (months). Removing the volcanos allowed me to regress each of the other forcings against the model results (with the volcano effect removed), so that I could see how they did. Figure 4 shows that result:

Figure 4. All other forcings regressed against GISSE hindcast temperature results after volcano effect is removed. Forcing abbreviations (used in original dataset): W-M_GHGs = Well Mixed Greenhouse Gases; O3 = Ozone; StratH2O = Stratospheric Water Vapor; Solar = Energy From The Sun; LandUse = Changes in Land Use and Land Cover; SnowAlb = Albedo from Changes in Snow Cover; StratAer = Stratospheric Aerosols from volcanos; BC = Black Carbon; ReflAer = Reflective Aerosols; AIE = Aerosol Indirect Effect. Numbers in parentheses show how well the various forcings explain the remaining model results, with 1.0 being a perfect score. (The number is called R squared, usually written R^2.) Photo Source
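
One straightforward way to do the volcano removal and the per-forcing comparison described above (there are others; this is only a sketch, and the dictionary layout and names are my own assumptions, using the abbreviations from the caption):

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a simple linear fit of y on x (with intercept)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    X = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def score_forcings_without_volcano(forcings, model_temp, volcano_key="StratAer"):
    """forcings: dict of name -> yearly series. Fit the volcanic forcing to the
    model results, subtract that fit, then report R^2 of each other forcing
    against the residual (one plausible way of 'removing the volcanos')."""
    volcanic = np.asarray(forcings[volcano_key], dtype=float)
    model_temp = np.asarray(model_temp, dtype=float)
    X = np.column_stack([volcanic, np.ones_like(volcanic)])
    coef, *_ = np.linalg.lstsq(X, model_temp, rcond=None)
    residual = model_temp - X @ coef
    return {name: r_squared(series, residual)
            for name, series in forcings.items() if name != volcano_key}
```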

Now, this is again interesting. Once the effect of the volcanos is removed, there is very little difference in how well the other forcings explain the remainder. With the obvious exception of solar, the R^2 values of most of the forcings are quite similar. The only two that outperform a simple straight line are stratospheric water vapor and GHGs, and that is only by 0.01.

I wanted to look at the shape of the forcings to see if I could understand this better. Figure 5 has NASA GISS’s view of the forcings, shown at their actual sizes:

Figure 5: The radiative forcings used by the GISSE model as shown by GISS. SOURCE

Well, that didn’t tell me a lot (not GISS’s fault, just the wrong chart for my purpose), so I took the forcing data, standardized it, and looked at the forcings in a form in which their shapes could be seen. I found that the reason they all fit so well lies in the shape of the forcings. All of them increase slowly (either negatively or positively) until 1950. After that, they increase more quickly. To see these shapes, it is necessary to standardize the forcings so that they all have the same size. Figure 6 shows what the forcings used by the model look like after standardization:

Figure 6. Forcings for the GISSE model hindcast 1880-2003. Forcings have been “standardized” (set to a standard deviation of 1.0) and set to start at zero as in Figure 4.
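
The “standardization” itself is a two-line transformation. A sketch (the details of the actual worksheet calculation may differ slightly):

```python
import numpy as np

def standardize(series):
    """Scale a forcing series to a standard deviation of 1.0
    and shift it so that it starts at zero, as in Figure 6."""
    series = np.asarray(series, dtype=float)
    scaled = series / series.std()
    return scaled - scaled[0]

# Applied to every forcing in a dict of name -> yearly values (layout assumed):
# standardized = {name: standardize(vals) for name, vals in forcings.items()}
```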

There are several oddities about their forcings. First, I had assumed that the forcings used were based at least loosely on reality. To make this true, I need to radically redefine “loosely”. You’ll note that by some strange coincidence, many of the forcings go flat from 1990 onwards … loose. Does anyone believe that all those forcings (O3, Landuse, Aerosol Indirect, Aerosol Reflective, Snow Albedo, Black Carbon) really stopped changing in 1990? (It is possible that this is a typographical or other error in the dataset. This idea is supported by the slight post-1990 divergence of the model results from the forcings as seen in Fig. 3)

Next, take a look at the curves for snow albedo and black carbon. It’s hard to see the snow albedo curve, because it is behind the black carbon curve. Why should the shapes of those two curves be nearly identical? … loose.

Next, in many cases the “curves” for the forcings are made up of a few straight lines. Whatever the forcings might or might not be, they are not straight lines.

Next, with the exception of solar and volcanoes, the shape of all of the remaining forcings is very similar. They are all highly correlated, and none of them (including CO2) is much different from a straight line.

Where did these very strange forcings come from? The answer is neatly encompassed in “Twentieth century climate model response and climate sensitivity”, Kiehl, GRL 2007 (emphasis mine):

A large number of climate modeling groups have carried out simulations of the 20th century. These simulations employed a number of forcing agents in the simulations. Although there are established data for the time evolution of well-mixed greenhouse gases [and solar and volcanos although Kiehl doesn’t mention them], there are no established standard datasets for ozone, aerosols or natural forcing factors.

Lest you think that there is at least some factual basis to the GISSE forcings, let’s look again at black carbon and snow albedo forcing. Black carbon is known to melt snow, and this is an issue in the Arctic, so there is a plausible mechanism to connect the two. This is likely why the shapes of the two are similar in the GISSE forcings. But what about that shape, increasing over the period of analysis? Here’s one of the few actual records of black carbon in the 20th century, from “20th-Century Industrial Black Carbon Emissions Altered Arctic Climate Forcing”, Science Magazine (paywall):

Figure 7. An ice core record from the Greenland cap showing the amount of black carbon trapped in the ice, year by year. Spikes in the summer are large forest fires.

Note that rather than increasing over the century as GISSE claims, the observed black carbon levels peaked in about 1910-1920, and have been generally decreasing since then.

So in addition to the dozens of parameters that they can tune in the climate models, the GISS folks and the other modelers got to make up some of their own forcings out of whole cloth … and then they get to tell us proudly that their model hindcasts do well at fitting the historical record.

To close, Figure 8 shows the best part, the final part of the game:

Figure 8. ORIGINAL IPCC CAPTION (emphasis mine). A climate model can be used to simulate the temperature changes that occur from both natural and anthropogenic causes. The simulations in a) were done with only natural forcings: solar variation and volcanic activity. In b) only anthropogenic forcings are included: greenhouse gases and sulfate aerosols. In c) both natural and anthropogenic forcings are included. The best match is obtained when both forcings are combined, as in c). Natural forcing alone cannot explain the global warming over the last 50 years. Source

Here is the sting in the tale. They have designed the perfect forcings, and adjusted the model parameters carefully, to match the historical observations. Having done so, the modelers then claim that the fact that their model no longer matches historical observations when you take out some of their forcings means that “natural forcing alone cannot explain” recent warming … what, what?

You mean that if you tune a model with certain inputs, then remove one or more of the inputs used in the tuning, your results are not as good as with all of the inputs included? I’m shocked, I tell you. Who would have guessed?

The IPCC actually says that because the tuned models don’t work well with part of their input removed, this shows that humans are the cause of the warming … not sure what I can say about that.

What I Learned

1. To a very close approximation (R^2 = 0.91, average error less than a tenth of a degree C) the GISS model output can be replicated by a simple linear transformation of the total forcing and the elapsed time. Since the climate is known to be a non-linear, chaotic system, this does not bode well for the use of GISSE or other similar models.

2. The GISSE model illustrates that when hindcasting the 20th century, the modelers were free to design their own forcings. This explains why, despite having climate sensitivities ranging from 1.8 to 4.2°C per doubling of CO2, the various climate models all provide hindcasts which are very close to the historical records. The models are tuned, and the forcings are chosen, to do just that.

3. The GISSE model results show a climate sensitivity of half a degree per doubling of CO2, far below the IPCC value.

4. Most of the assumed GISS forcings vary little from a straight line (except for some of them going flat in 1990).

5. The modelers truly must believe that the future evolution of the climate can be calculated using a simple linear function of the forcings. Me, I misdoubts that …

In closing, let me try to anticipate some objections that people will likely have to this analysis.

1. But that’s not what the GISSE computer is actually doing! It’s doing a whole bunch of really really complicated mathematical stuff that represents the real climate and requires 160 teraflops to calculate, not some simple equation. This is true. However, since their model results can be replicated so exactly by this simple linear model, we can say that considered as black boxes the two models are certainly equivalent, and explore the implications of that equivalence.

2. That’s not a new finding, everyone already knew the models were linear. I also thought the models were linear, but I have never been able to establish this mathematically. I also did not realize how rigid the linearity was.

3. Is there really an inherent linear warming trend built into the model? I don’t know … but there is something in the model that acts just like a built-in inherent linear warming. So in practice, whether the linear warming trend is built-in, or the model just acts as though it is built-in, the outcome is the same. (As a side note, although the high R^2 of 0.91 argues against the possibility of things improving a whole lot by including a simple lagging term, Lucia’s model is worth exploring further.)

4. Is this all a result of bad faith or intentional deception on the part of the modelers? I doubt it very much. I suspect that the choice of forcings and the other parts of the model “jes’ growed”, as Topsy said. My best guess is that this is the result of hundreds of small, incremental decisions and changes made over decades in the forcings, the model code, and the parameters.

5. If what you say is true, why has no one been able to successfully model the system without including anthropogenic forcing?

Glad you asked. Since the GISS model can be represented as a simple linear model, we can use the same model with only natural forcings. Here’s a first cut at that:

Figure 9. Model of the climate using only natural forcings (top panel). The all-forcings model from Figure 3 is included in the lower panel for comparison. Yes, the R^2 with only natural forcings is smaller, but it is still a pretty reasonable model.

6. But, but … you can’t just include a 0.42 degree warming like that! For all practical purposes, GISSE does the same thing only with different numbers, so you’ll have to take that up with them. See the US Supreme Court ruling in the case of Sauce For The Goose vs. Sauce For The Gander.

7. The model inherent warming trend doesn’t matter, because the final results for the IPCC scenarios show the change from model control runs, not absolute values. As a result, the warming trend cancels out, and we are left with the variation due to forcings. While this sounds eminently reasonable, consider that if you use their recommended procedure (cancel out the 0.25°C-per-century inherent warming trend) for their 20th century hindcast shown above, it gives an incorrect answer … so that argument doesn’t make sense.

To simplify access to the data, I have put the forcings, the model response, and the GISS temperature datasets online here as an Excel worksheet. The worksheet also contains the calculations used to produce Figure 3.

And as always, the scientific work of a thousand hands continues.

Regards,

w.

 

[UPDATE: This discussion continues at Where Did I Put That Energy.]


So if someone commits fraud to gain time/access to a supercomputer then is that also considered theft? How much is the time of a supercomputer worth?

Horace the Grump

Well who’d a thunk it… another nail in the coffin of trying to base policy only on ‘models’, which for all their teraflopiness are really rather simple and by the looks of it not very good at all…

RayG

This is a classic example of why the data, methodologies, code, etc. require rigorous scrutiny, quality control, version control, archiving AND access. This kind of analysis can only be done if the data, widely defined, are available. Does anyone really wonder why this is proving to be so difficult to obtain?

Neville

This is an insufficiently understood characteristic of complex modeling. After a certain point, the only real impact of adding complexity becomes its ability to conceal from the modelers the basic nature of what they have done.
As in so many other areas, the climate modelers seem here to have magnified a common scientific error to the point of absurd self-parody. If they now choose to go with the usual flat denial response, their absurdity will only be more apparent, and the eventual judgment of history will only be more damning.

Peter Hartley

I think that instead of saying “a linear regression of the total forcings on those temperatures” you meant to say “a linear regression of the temperatures on those total forcings”. You are predicting the temperatures using the forcings, not the other way around.

Bart

FTA:” 2. The GISSE model illustrates that when hindcasting the 20th century, the modelers were free to design their own forcings.”
Which means that, essentially, all it is, is a glorified multivariable curve fit, and the CAGWers have convinced themselves that it is somehow miraculous that the curve fit fits the data, and that this therefore confirms their worst fears.
This is the kind of (non) thinking which brought humankind voodoo dolls and leeches. How depressingly… primitive.

tty

Now, that 0.25 degree increase per century in the absence of any change in forcing is interesting, and really makes any action on GHG’s quite irrelevant, since it implies that the oceans will start boiling before we’re even halfway through the next glacial cycle no matter what we do.

RayG

Judith Curry has resumed her thread on Climate model verification and validation: Part II at Climate Etc judithcurry.com/2010/12/18/climate-model-verification-and-validation-part-ii/ Her reason is the interest that an invited paper received at AGU last week. The title of the paper is: “Do Over or Make Do? Climate Models as a Software Development Challenge (Invited)” and is found at adsabs.harvard.edu/abs/2010AGUFMIN14B..01E I reproduce the abstract below. Please delete if there are any IP issues. As several of my friends in the legal profession say, res ipsa loquitur.
“We present the results of a comparative study of the software engineering culture and practices at four different earth system modeling centers: the UK Met Office Hadley Centre, the National Center for Atmospheric Research (NCAR), The Max-Planck-Institut für Meteorologie (MPI-M), and the Institut Pierre Simon Laplace (IPSL). The study investigated the software tools and techniques used at each center to assess their effectiveness. We also investigated how differences in the organizational structures, collaborative relationships, and technical infrastructures constrain the software development and affect software quality. Specific questions for the study included 1) Verification and Validation – What techniques are used to ensure that the code matches the scientists’ understanding of what it should do? How effective are these are at eliminating errors of correctness and errors of understanding? 2) Coordination – How are the contributions from across the modeling community coordinated? For coupled models, how are the differences in the priorities of different, overlapping communities of users addressed? 3) Division of responsibility – How are the responsibilities for coding, verification, and coordination distributed between different roles (scientific, engineering, support) in the organization? 4) Planning and release processes – How do modelers decide on priorities for model development, how do they decide which changes to tackle in a particular release of the model? 5) Debugging – How do scientists debug the models, what types of bugs do they find in their code, and how they find them? The results show that each center has evolved a set of model development practices that are tailored to their needs and organizational constraints. These practices emphasize scientific validity, but tend to neglect other software qualities, and all the centers struggle frequently with software problems. The testing processes are effective at removing software errors prior to release, but the code is hard to understand and hard to change. Software errors and model configuration problems are common during model development, and appear to have a serious impact on scientific productivity. These problems have grown dramatically in recent years with the growth in size and complexity of earth system models. Much of the success in obtaining valid simulations from the models depends on the scientists developing their own code, experimenting with alternatives, running frequent full system tests, and exploring patterns in the results. Blind application of generic software engineering processes is unlikely to work well. Instead, each center needs to lean how to balance the need for better coordination through a more disciplined approach with the freedom to explore, and the value of having scientists work directly with the code. This suggests that each center can learn a lot from comparing their practices with others, but that each might need to develop a different set of best practices.”

Where is the planetary mechanics influence on solar climate which solely dictates our climate?
The Landscheidt Grand Solar Minimum that started in 1990 is grinding to its peak in 2030. And CO2 can do nothing to stop the increasing brutal cold and crop-killing famine that will result. Nothing.

Brilliant paper. Needs to be published far and wide, starting in Journal of Climate or similar, except it’s over their heads!

Squidly

GIGO….

Squidly

RayG,
Yes, as I have written about extensively. The state of GISSE, or any of the other models I have inspected, would not come close to passing muster anywhere I have worked (except Gov.). I have been designing and developing software for something closing in on 30 years now; even in the early days we maintained higher levels of controls and scrutiny. Today it is mandatory, or you don’t eat.

Madman2001

Graphs with background photos disrupt and disturb comprehension of the data while adding nothing but prettification. The first photo, with the many colors and sharp contrast, is particularly distracting and is an excellent example of chartjunk:
http://en.wikipedia.org/wiki/Chartjunk

Bill Illis

The temperature impact from GHG forcing in Model E follows 4.053 × ln(CO2) − 23.0, so it is not quite linear (using CO2 as a proxy for all the GHGs). We are just in a particular part of the formula which is close to linear right now.
They are not playing around with the GHG forcings, it is all the other forcings like Aerosols and the unrealistically high Volcano forcings that are being used for the plugs to match the historical record.
http://img183.imageshack.us/img183/6131/modeleghgvsotherbc9.png
The TempC response per watt/m2 has always bothered me. One needs to assume all the feedbacks will occur to get to the higher numbers often quoted (1 W/m2 of GHG forcing results in an additional 2 W/m2 of water vapour and Albedo feedbacks). Hansen also assumes there is lag as the oceans absorb some of the forcing and then some of the feedbacks like Albedo are more long-term. The response could start out at 0.5C/W/m2 and rise to 0.81C/W/m2 after the lags kick in.
But GISS Model E net forcing was +1.9 W/m2 in 2003 and that would only produce 0.34C/W/m2 of response (including all the feedbacks). After 2003, the oceans stopped absorbing some of the forcing so it might even be falling from this low number. It is probably the actual response that the Earth’s climate gives because I have seen this same number in all the historical climate reconstructions I have done.
GHG doubling: +3.7 W/m2 × 0.34°C/W/m2 = +1.26°C
REPLY: Bill I sent you an email a few days ago, but got no response. Check your spam folder – Anthony
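
For anyone who wants to play with Bill’s numbers, here is a quick sketch of the arithmetic in his comment (his fitted formula and his 0.34°C per W/m2 figure; the CO2 values plugged in are purely illustrative):

```python
import math

def model_e_ghg_temp(co2_ppm):
    """Bill Illis's fitted GHG response for Model E: 4.053 * ln(CO2) - 23.0
    (CO2 used as a proxy for all the well-mixed GHGs)."""
    return 4.053 * math.log(co2_ppm) - 23.0

# Illustrative CO2 values only (not data from the post):
print(model_e_ghg_temp(280.0), model_e_ghg_temp(380.0))

# His doubling arithmetic: 3.7 W/m2 * 0.34 C per W/m2
print(3.7 * 0.34)   # ~1.26 C per doubling
```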

Bob in Castlemaine

An interesting analysis Willis. Somehow it all looks suspiciously like the process of using a simple mechanical model to fit a known data set. This is the technique used to come up with race horse tipping programs based on historic race results. Unscrupulous scammers continue to sell these race tipping programs to gullible punters.
Doesn’t this seem to have a familiar ring to it?

Wow – nice job Willis!
The models are apparently able to reproduce their inputs. Not exactly predictive.

Cementafriend

Thanks Willis! Will need to look at the spreadsheet.
Everyone knows that the GISS temperatures are not correct and have been exaggerated by selection of sites with UHI effects and the spreading of temperatures from these sites to areas where there is no measurement.
If I have your comments correct then it is possible to take out the supposed GHG effect altogether and still be able to model the actual temperatures. This would add to the findings in ice cores and past experimental data (such as compiled by Beck) that CO2 lags temperature and so has no effect on climate (or weather).

Squidly

Madman2001,
I would have to agree. I personally don’t care for the background images on the charts. I could live without them.

DirkH

Willis says:
“4. Is this all a result of bad faith or intentional deception on the part of the modelers? I doubt it very much. I suspect that the choice of forcings and the other parts of the model “jes’ growed”, as Topsy said. My best guess is that this is the result of hundreds of small, incremental decisions and changes made over decades in the forcings, the model code, and the parameters.”
IMHO, Hansen and Schneider and the other team leaders wanted to show warming, and their programmers had the task to deliver that warming while doing a good hindcasting. The motivation of everybody in the system was to make this happen, by parameters or by inventing the past history of the forcings. Everybody turned a blind eye to it. It would have been the job of QA to find this. There was no QA. Where there is no QA, anything can happen. Oh, we have peer review, but that was rigged.
That being said – the job of the modelers is even easier than I thought. They are all natural born slackers. We’ve all been taken for a ride.

Squidly

I am wondering what they do with all of those extra teraflops; it sounds to me like I could do the same processing on my Wii at 2.5 MIPS (million instructions per second) with equal results (and a little bit cheaper). Maybe those extra teraflops are contributing to catastrophic warming? Perhaps someone should design another model and look into that…

TimTheToolMan

160 teraflops? I could do that with a pencil and some graph paper. Oh and a pencil sharpener. Better make that two pencils. Geez, the expenses just keep piling up!

The use and reference to a 160 teraflops capable machine to add heft to the credibility of models/simulations, calls to mind the tale of a mega-rancher in Texas. Wanting to know why his black cattle ate more grain than his white cattle, he hired a team of experts and leased a couple of Cray computers. After a year of effort, the report concluded that he had more black cattle.

James Barker

I am not a scientist, but I do read quite a bit, and perhaps understand some of it. A computer model can only attempt to simulate reality (however defined). And then, as I understand it, it must be verified by actually measuring the reality that was simulated. The KISS principle seems to tell me that if you must make up fudge factors to get the model to work, then the model itself didn’t simulate this reality at all. We may learn quite a bit about the modelers’ intentions by studying their efforts, but nothing at all about the reality we are studying.

Baa Humbug

Thanx Mr E, I love a free education.
This validates what Richard S Courtney has been saying all along, I’m sure he’ll be along soon to verify.
Now we await either Gavin (no chance) or Lacis to show up (nil chance)

Willis Eschenbach

Cementafriend says:
December 19, 2010 at 3:38 pm


If I have your comments correct then it is possible to take out the supposed GHG effect altogether and still be able to model the actual temperatures.

Only in the simplest sense. The problem is that both my model (and apparently the GISS model) contain a built-in trend. This makes their predictive value something like zero.

Willis Eschenbach says:
December 19, 2010 at 4:14 pm
“Only in the simplest sense. The problem is that both my model (and apparently the GISS model) contain a built-in trend. This makes their predictive value something like zero.”
If that is the case, then why didn’t you post my question? I was curious for an answer from a legal perspective.

Nice willis,
When Lucia was working on Lumpy I suggested that she use the model to do quick and dirty IPCC forecasts. Now, having been to AGU I sat through a talk where a nice wizard lady took GCM results and created a similar emulation using regressions on the results. This allowed them to do many more hindcasts as part of a paleo recon, where the GCM was used as a prior in a Bayesian approach to proxy recons.
I don’t have time to go over the details of your handling of the forcing (maybe Lucia can chime in), but the point you make about attribution studies bears some looking into. I’ll remind you that in the attribution studies they only used models that have negligible drift (see the Supplemental material, chapter 9). Also, the comparison against observations is done in a unique way.
The black carbon stuff was interesting as well

Carl Chapman

David Attenborough did a little YouTube video showing CO2 must be the cause because the models predicted the output so perfectly if CO2 was included. I knew then that they were tuning the models, because the match was too perfect. I think it was Anthony who pointed out that the lines (model and actual) crossed every few months and never diverged by more than a small fraction of a degree.

Thanks, Willis. As always, your findings are educational, and your presentation is entertaining–a wonderful combination.

Willis Eschenbach

Steven Mosher says:
December 19, 2010 at 4:21 pm

Nice willis, … [good stuff snipped]

Thanks, Mosh. My concern would be that although a model shows little drift in the control runs, it may show drift in the presence of forcings. And without testing, we don’t know …

RayG

Judith Curry is starting another thread on “Climate model verification: Part II.” This seems appropriate given Willis’ analysis. judithcurry.com/2010/12/18/climate-model-verification-and-validation-part-ii/ She mentions the abstract of an invited paper from last week’s AGU meeting by Easterbrook titled “Do Over or Make Do? Climate Models as a Software Development Challenge (Invited)”, found at adsabs.harvard.edu/abs/2010AGUFMIN14B..01E and quoted in full in my earlier comment above.
I am not sure if some of the climate science modelers would recognize “best practices” if they jumped up and bit them.

Jim D

Willis, perhaps you need to use the average forcing rather than change in forcing to get your factor of two (0.34 versus 0.7)? I don’t see the average forcing being much above 1 W/m2.

RayG

I neglected to mention that the comments on Judith’s thread referenced above are fascinating and worth reading.

Willis Eschenbach

Jim D says:
December 19, 2010 at 4:35 pm

Willis, perhaps you need to use the average forcing rather than change in forcing to get your factor of two (0.34 versus 0.7)? I don’t see the average forcing being much above 1 W/m2.

Not sure what you mean by that, Jim. I’ve used standard linear regression to obtain the coefficient 0.34, not average or change in forcing.
w.

Peter Pearson

After Figure 4, “stratospheric ozone” should be “stratospheric water vapor”.

Willis Eschenbach

Peter Pearson says:
December 19, 2010 at 4:56 pm

After Figure 4, “stratospheric ozone” should be “stratospheric water vapor”.

Thanks, fixed.

Willis Eschenbach

Madman2001 says:
December 19, 2010 at 3:30 pm

Graphs with background photos disrupt and disturb comprehension of the data while adding nothing but prettification.

As the Romans used to say, “de gustibus et coloribus non est disputandum”. That means there’s no use arguing about tastes and colors. If you say you like blue, I can’t dispute that.
So while you may dislike graphs with background photos, I like them. I think that they add to the presentation. I don’t mind if people have to study them a bit to figure them out. Plus, I like science to be fun and interesting. Finally, while you seem to think that “prettification” is something to be avoided, I like pretty things. Go figure. De gustibus et coloribus …

Steve Fitzpatrick

Nice post Willis.
One cause for the discrepancy between your low simple regression sensitivity (0.13 degree per watt) and the GISS model sensitivity is the vast assumed accumulation of heat in the oceans in the GISS model (on the order of 0.85 watt per square meter)… which is obviously not correct.
Your analysis of the assumed forcings being a pure kludge to make the different models fit the historical temperature record is absolutely correct. I can imagine no other field where you can make any model fit the data by making up data sets for unknown inputs, and then claim your model is “verified”. As pure a form of intellectual corruption/self-deception as I’ve ever seen in science. The models all disagree about the true sensitivity, yet assume vastly different historical forcings, and still we are told by the IPCC that these models can provide useful information about what will happen in 50 or 100 years, if we form an ‘ensemble’ of models, each of which uses a different set of forcing kludges. All such predictions are pure rubbish, and should be treated as such by the public.

Alan Millar

You are pointing out the obvious Willis.
I have been saying this since day one about the GCMs.
The warmers argue that, whilst their models use different assumptions about sensitivity, aerosols etc, they all show one thing: only CO2 can explain the 20th century temperature record. They say if they take out CO2 nothing works. So therefore the observed warming is due to CO2, QED.
Of course it is absolute cobblers. They are in charge of the various other parameters and their values. The models use different assumptions on some of these parameters such as aerosols and that is how they are made to fit a backcast record. Well what a surprise that is!
I always say, send me a random sample of roulette spins and I will send you a model that will win you money. I can do it every time with 100% certainty. Send me 1000 different samples and I will send you a model that proves you can win money. Now a lot of these models will be slightly different, just like the GCMs, but just like the GCMs they all have one thing in common, they all prove you can win money playing roulette.
Therefore, presumably the warmers would have to agree that models have proven you will win money at roulette, QED.
You can have a million GCMs all ‘proving’ that CO2 is the culprit but whilst the designers get to play about with the parameters it proves nothing.
Willis, you attempted to show that solar could be responsible for the 20th century warming and seeing as you could control the parameters you had little difficulty in producing a model that showed this.
The warmers do the same thing with the GCMs but with greater complexity and more obfuscation but the same thing really.
Alan

As so many people have said, before these models become the basis for expensive public policy decisions, we need to see audits — reviews by an unaffiliated multi-disciplinary team of experts. Just like testing of new drugs, third party review is essential.

Jim D

I meant the average total forcing is about 1 W/m2 and the temperature change was 0.7 C, so the sensitivity is 0.7 C per W/m2.

jorgekafkazar

Squidly says: “I am wondering what they do with all of those extra teraflops…”
Horse race programs, Squidders.

jorgekafkazar

Cementafriend said: “…If I have your comments correct then it is possible to take out the supposed GHG effect altogether and still be able to model the actual temperatures.”
Willis Eschenbach replied: “Only in the simplest sense. The problem is that both my model (and apparently the GISS model) contain a built-in trend. This makes their predictive value something like zero.”
Is it the predictive value or the predictive skill of the models that is zero?

Roger Andrews

Willis:
The reason you can’t replicate the IPCC’s estimate of 3C for a doubling of CO2 is that the IPCC’s number is an “equilibrium” sensitivity calculated from temperatures which aren’t reached for at least another 500 years, according to the models. (See Nakashiki 2007 for some illustrative plots.) Table 8.2 of the IPCC AR4 shows roughly a factor of two difference between model-derived transient and equilibrium climate sensitivities, but the transient sensitivities are still higher than any sensitivity you can back out of the IPCC’s models and forcings over the 1900-2100 period.
Incidentally, when I performed the same analysis a couple of years ago I found I was able to reproduce the GISS Ocean-Atmosphere model temperatures between 1900 and 2100 almost exactly by taking the observed and predicted CO2 concentrations, converting them into watts/sq m forcings using 5.35 * ln(C2/C1), multiplying these forcings by 0.55 and adding 13.8. And I did this in about two minutes on a computer that works at maybe a couple of kiloflops on a good day and at a cost of zero.
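Roger’s two-minute emulation, written out as a sketch (his coefficients as described above; the CO2 values in the example are illustrative only):

```python
import math

def giss_oa_emulation(co2_ppm, co2_ref_ppm):
    """Roger Andrews' emulation as described above: convert CO2 concentrations
    to a forcing with 5.35 * ln(C2/C1), multiply by 0.55, and add 13.8."""
    forcing_wm2 = 5.35 * math.log(co2_ppm / co2_ref_ppm)
    return 0.55 * forcing_wm2 + 13.8

# Illustrative values only (not the series Roger used):
print(giss_oa_emulation(560.0, 280.0))   # a doubling: 0.55 * 3.7 + 13.8, about 15.8
```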

tom s

Good report and all Willis, but stand back and look at this folks… even with those graphs that look to turn up so sharply, we’re still talking just tenths of a degree of change here. I mean, really… panic in the streets. As if planet earth can’t fluctuate by 0.6°C in 150 years. Static she is not!

Nic

I’m a bit new to all of this AGW stuff, but I have a question about Figure 7. I look at the part from 1850 to 1875. I see that natural forcing is relatively high and positive. Then I look at anthropogenic forcing for the same period and I see that some part of it has a positive forcing too. So then I look at both combined for that period and see that the total forcing is smaller than the natural one. I was under the impression that two positive forcings would kind of “add up” and force even more. What am I missing here?

MikeN

Don’t you get high correlations when you compare differentials in data? And isn’t that a big no-no?

Curiousgeorge

Quite an interesting headline, Willis. And a nice sense of humor you have. When I first read it, I thought it might have something to do with Victoria’s Secret. 🙂

DB

I notice that the solar forcing increases quite a bit from the beginning of the century up until about 1960. Is that considered realistic these days?

Stephen Singer

You’ve two figure 4’s. Did you mean 4a and 4b?