New paper: climate models short on 'physics required for realistic simulation of the Earth system'

divergence — The career path of a climate modeler – reality -vs- expectations Image: Dr Paul T. Thomas

I’m pleased to have had a chance to review this new paper, just published in the Journal of Climate:

Suckling, Emma B., and Leonard A. Smith, 2013: An Evaluation of Decadal Probability Forecasts from State-of-the-Art Climate Models. J. Climate, 26, 9334–9347. doi: http://dx.doi.org/10.1175/JCLI-D-12-00485.1

The lead author, Emma Suckling, was kind enough to provide me with a copy for reading. This paper seeks to find the errors in the EU-based ENSEMBLES project by running hindcasts and evaluating their error. I was struck by the fact that in figure 2 below, there was broad disagreement between four models, with one having errors as large as 4.5 a decade out.

The conclusion rather says it all: these models just don’t yet capture the physical processes of the dynamic and complex Earth, hence the photo I included above.

Abstract

While state-of-the-art models of Earth’s climate system have improved tremendously over the last 20 years, nontrivial structural flaws still hinder their ability to forecast the decadal dynamics of the Earth system realistically. Contrasting the skill of these models not only with each other but also with empirical models can reveal the space and time scales on which simulation models exploit their physical basis effectively and quantify their ability to add information to operational forecasts. The skill of decadal probabilistic hindcasts for annual global-mean and regional-mean temperatures from the EU Ensemble-Based Predictions of Climate Changes and Their Impacts (ENSEMBLES) project is contrasted with several empirical models. Both the ENSEMBLES models and a “dynamic climatology” empirical model show probabilistic skill above that of a static climatology for global-mean temperature. The dynamic climatology model, however, often outperforms the ENSEMBLES models. The fact that empirical models display skill similar to that of today’s state-of-the-art simulation models suggests that empirical forecasts can improve decadal forecasts for climate services, just as in weather, medium-range, and seasonal forecasting. It is suggested that the direct comparison of simulation models with empirical models becomes a regular component of large model forecast evaluations. Doing so would clarify the extent to which state-of-the-art simulation models provide information beyond that available from simpler empirical models and clarify current limitations in using simulation forecasting for decision support. Ultimately, the skill of simulation models based on physical principles is expected to surpass that of empirical models in a changing climate; their direct comparison provides information on progress toward that goal, which is not available in model–model intercomparisons.

================================================================

Introduction

State-of-the-art dynamical simulation models of Earth’s climate system are often used to make probabilistic predictions about the future climate and related phenomena with the aim of providing useful information for decision support (Anderson et al. 1999; Met Office 2011; Weigel and Bowler 2009; Alessandri et al. 2011; Hagedorn et al. 2005; Hagedorn and Smith 2009; Meehl et al. 2009; Doblas-Reyes et al. 2010, 2011; Solomon et al. 2007; Reifen and Toumi 2009). Evaluating the performance of such predictions from a model or set of models is crucial not only in terms of making scientific progress but also in determining how much information may be available to decision makers via climate services. It is desirable to establish a robust and transparent approach to forecast evaluation, for the purpose of examining the extent to which today’s best available models are adequate over the spatial and temporal scales of interest for the task at hand. A useful reality check is provided by comparing the simulation models not only with other simulation models but also with empirical models that do not include direct physical simulation.

Decadal prediction brings several challenges for the design of ensemble experiments and their evaluation (Meehl et al. 2009; van Oldenborgh et al. 2012; Doblas-Reyes et al. 2010; Fildes and Kourentzes 2011; Doblas-Reyes et al. 2011); the analysis of decadal prediction systems will form a significant focus of the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5). Decadal forecasts are of particular interest both for information on the impacts over the next 10 years, as well as from the perspective of climate model evaluation. Hindcast experiments over an archive of historical observations allow approaches from empirical forecasting to be used for model evaluation. Such approaches can aid in the evaluation of forecasts from simulation models (Fildes and Kourentzes 2011; van Oldenborgh et al. 2012) and potentially increase the practical value of such forecasts through blending forecasts from simulation models with forecasts from empirical models that do not include direct physical simulation (Bröcker and Smith 2008).

This paper contrasts the performance of decadal probability forecasts from simulation models with that of empirical models constructed from the record of available observations. Empirical models are unlikely to yield realistic forecasts for the future once climate change moves the Earth system away from the conditions observed in the past. A simulation model, which aims to capture the relevant physical processes and feedbacks, is expected to be at least competitive with the empirical model. If this is not the case in the recent past, then it is reasonable to demand evidence that those particular simulation models are likely to be more informative than empirical models in forecasting the near future.

A set of decadal simulations from the Ensemble-Based Predictions of Climate Changes and Their Impacts (ENSEMBLES) experiment (Hewitt and Griggs 2004; Doblas-Reyes et al. 2010), a precursor to phase 5 of the Coupled Model Intercomparison Project (CMIP5) decadal simulations (Taylor et al. 2009), is considered. The ENSEMBLES probability hindcasts are contrasted with forecasts from empirical models of the static climatology, persistence, and a “dynamic climatology” model developed for evaluating other dynamical systems (Smith 1997; Binter 2012). Ensemble members are transformed into probabilistic forecasts via kernel dressing (Bröcker and Smith 2008); their quality is quantified according to several proper scoring rules (Bröcker and Smith 2006). The ENSEMBLES models do not demonstrate significantly greater skill than that of an empirical dynamic climatology model either for global-mean temperature or for the land-based Giorgi region temperatures (Giorgi 2002).
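For readers who want a feel for the kernel dressing step, here is a minimal Python sketch. It is not the authors' code. It assumes the simplest possible form of dressing: one Gaussian kernel of fixed width placed on each ensemble member, all equally weighted. The function names, the three member values, the observed value, and the kernel width are hypothetical; as I understand it, the published method (Bröcker and Smith 2008) fits the kernel parameters and can blend in climatology, which this sketch leaves out. The ignorance score is computed as minus log base 2 of the forecast density at the observed outcome.

```python
import math


def kernel_dressed_density(y, members, sigma):
    """Forecast density at y: an equally weighted mixture of Gaussian
    kernels, one centred on each ensemble member, each of width sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = sum(norm * math.exp(-0.5 * ((y - m) / sigma) ** 2) for m in members)
    return total / len(members)


def ignorance(y, members, sigma):
    """Ignorance score in bits: -log2 of the density placed on the outcome.
    Lower is better."""
    return -math.log2(kernel_dressed_density(y, members, sigma))


# Hypothetical three-member ensemble of temperature anomalies (illustrative only).
members = [0.30, 0.42, 0.55]
observed = 0.40   # hypothetical verifying observation
sigma = 0.10      # hypothetical kernel width; in practice this is fitted

print(ignorance(observed, members, sigma))
```

A narrow kernel rewards an ensemble that brackets the outcome closely and punishes one that misses it, which is why the kernel width matters so much with only three members.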

It is suggested that the direct comparison of simulation models with empirical models become a regular component of large model forecast evaluations. The methodology is easily adapted to other climate forecasting experiments and can provide a useful guide to decision makers about whether state-of-the-art forecasts from simulation models provide additional information to that available from easily constructed empirical models.

An overview of the ENSEMBLES models used for decadal probabilistic forecasting is discussed in section 2. The appropriate choice of empirical model for probabilistic decadal predictions forms the basis of section 3, while section 4 contains details of the evaluation framework and the transformation of ensembles into probabilistic forecast distributions. The performance of the ENSEMBLES decadal hindcast simulations is presented in section 5 and compared to that of the empirical models. Section 6 then provides a summary of conclusions and a discussion of their implications. The supplementary material includes graphics for models not shown in the main text, comparisons with alternative empirical models, results for regional forecasts, and the application of alternative (proper) skill scores. The basic conclusion is relatively robust: the empirical dynamic climatology (DC) model often outperforms the simulation models in terms of probability forecasting of temperature.

============================================================

suckling_fig2 — FIG. 2. Mean forecast error as a function of lead time across the set of decadal hindcasts for each of the ENSEMBLES simulation models as labeled. Note that the scale on the vertical axis for the ARPEGE4/OPA model is different than for the other models, reflecting the larger bias in this model.

Conclusions

The quality of decadal probability forecasts from the ENSEMBLES simulation models has been compared with that of reference forecasts from several empirical models. In general, the stream 2 ENSEMBLES simulation models demonstrate less skill than the empirical DC model across the range of lead times from 1 to 10 years. The result holds for a variety of proper scoring rules including ignorance (Good 1952), the proper linear score (PL) (Jolliffe and Stephenson 2003), and the continuous ranked probability score (CRPS) (Bröcker and Smith 2006). A similar result holds on smaller spatial scales for the Giorgi regions (see supplementary material). These new results for probability forecasts are consistent with evaluations of root-mean-square errors of decadal simulation models with other reference point forecasts (Fildes and Kourentzes 2011; van Oldenborgh et al. 2012; Weisheimer et al. 2009). The DC probability forecasts often place up to 4 bits more information (or 2^4 = 16 times more probability mass) on the observed outcome than the ENSEMBLES simulation models.
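The step from bits to probability mass is plain arithmetic: ignorance is measured in log base 2, so a gain of b bits means 2^b times more probability placed on the outcome. A small Python sketch of that conversion follows; the function names and the two example probabilities are made up for illustration and are not taken from the paper.

```python
import math


def probability_ratio(bits_gained):
    """How many times more probability mass a forecast places on the
    outcome when it scores bits_gained fewer bits of ignorance."""
    return 2.0 ** bits_gained


def bits_gained(p_reference, p_model):
    """Ignorance of the model minus ignorance of the reference, in bits.
    Positive means the reference forecast did better."""
    return math.log2(p_reference) - math.log2(p_model)


# 4 bits corresponds to a factor of 16 in probability mass.
print(probability_ratio(4))

# Hypothetical probabilities on the observed outcome (illustrative only).
print(bits_gained(p_reference=0.32, p_model=0.02))
```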

In the context of climate services, the comparable skill of simulation models and empirical models suggests that the empirical models will be of value for blending with simulation model ensembles; this is already done in ensemble forecasts for the medium range and on seasonal lead times. It also calls into question the extent to which current simulation models successfully capture the physics required for realistic simulation of the Earth system and can thereby be expected to provide robust, reliable predictions (and, of course, to outperform empirical models) on longer time scales.

The evaluation and comparison of decadal forecasts will always be hindered by the relatively small samples involved when contrasted with the case of weather forecasts; the decadal forecast–outcome archive currently considered is only half a century in duration. Advances both in modeling and in observation, as well as changes in Earth’s climate, are likely to mean the relevant forecast–outcome archive will remain small. One improvement that could be made to clarify the skill of the simulation models is to improve the experimental design of hindcasts: in particular, to increase the ensemble size used. For the ENSEMBLES models, each simulation ensemble consisted of only three members launched at 5-year intervals. Larger ensembles and more frequent forecast launch dates can ease the evaluation of skill without waiting for the forecast–outcome archive to grow larger.

The analysis of hindcasts can never be interpreted as an out-of-sample evaluation. The mathematical structure of simulation models, as well as parameterizations and parameter values, has been developed with knowledge of the historical data. Empirical models with a simple mathematical structure suffer less from this effect. Prelaunch empirical models based on the DC structure and using only observations before the forecast launch date also outperform the ENSEMBLES simulation models. This result is robust over a range of ensemble interpretation parameters (i.e., variations in the kernel width used). Both prelaunch trend models and persistence models are less skillful than the DC models considered.

The comparison of near-term climate probability forecasts from Earth simulation models with those from dynamic climatology empirical models provides a useful benchmark as the simulation models improve in the future. The blending (Bröcker and Smith 2008) of simulation models and empirical models is likely to provide more skillful probability forecasts in climate services, for both policy and adaptation decisions. In addition, clear communication of the (limited) expectations for skillful decadal forecasts can avoid casting doubt on well-founded physical understanding of the radiative response to increasing carbon dioxide concentration in Earth’s atmosphere. Finally, these comparisons cast a sharp light on distinguishing whether current limitations in estimating the skill of a model arise from external factors like the size of the forecast–outcome archive or from the experimental design. Such insights are a valuable product of ENSEMBLES and will contribute to the experimental design of future ensemble decadal prediction systems.


Berényi Péter

November 28, 2013 9:02 am

Computational climate models’ failure to exhibit observed interhemispheric symmetry in reflected shortwave radiation is a much more serious issue than deviation from climate projections for a few decades. It directly indicates some missing physics in the theory underlying all models, not just implementation bugs; therefore it is not even correctable until a general theory of irreproducible quasi-stationary non-equilibrium thermodynamic systems emerges, backed by actual experiments performed on members of said class.
Journal of Climate, Volume 26, Issue 2 (January 2013)
doi: 10.1175/JCLI-D-12-00132.1
The Observed Hemispheric Symmetry in Reflected Shortwave Irradiance
Aiko Voigt, Bjorn Stevens, Jürgen Bader and Thorsten Mauritsen

thallstd

November 28, 2013 9:03 am

And further…
According to Raymond W. Schmitt, a senior scientist of the Department of Physical Oceanography at Woods Hole Oceanographic Institution, it will take about 162 years to get adequate resolution in computer models of the ocean. Of course, that was in 2000, so less than 150 years to go now…
“It will take a factor of 10^8 improvement in 2 horizontal dimensions (100 km to 1 mm, the salt dissipation scale), a factor of 10^6 in the vertical dimension (~10 levels to 10^7) and ~10^5 in time (fraction of a day to fraction of a second); an overall need for an increase in computational power of ~10^27. With an order of magnitude increase in computer speed every 6 years, it will take 162 years to get adequate resolution in computer models of the ocean.”
http://www.whoi.edu/page.do?pid=8916&tid=282&cid=24777
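The arithmetic in that quote can be checked in a few lines of Python. This is only a sketch of the quoted reasoning: the variable names are mine, and the inputs are the orders of magnitude given in the quote (two horizontal dimensions at 10^8 each, 10^6 in the vertical, 10^5 in time) together with its assumption of a tenfold speed-up every 6 years.

```python
# Orders of magnitude (powers of ten) of refinement needed, as quoted.
orders_needed = {
    "horizontal_x": 8,  # 100 km down to 1 mm
    "horizontal_y": 8,  # second horizontal dimension
    "vertical": 6,      # about 10 levels up to 10^7
    "time": 5,          # fraction of a day down to fraction of a second
}

# Factors multiply, so their exponents add.
total_orders = sum(orders_needed.values())

# Quoted assumption: one order of magnitude in computer speed every 6 years.
years_per_order = 6
years_required = total_orders * years_per_order

print(f"computational power needed: 10^{total_orders}")
print(f"years required: {years_required}")
```

The whole estimate hinges on the assumed rate of one order of magnitude per 6 years; a different rate scales the answer in direct proportion.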

lemiere jacques

November 28, 2013 9:04 am

hey you don’t understand the point, in fact we have to ‘trust’ the model, and to test a model is a proof of lack of trust.

Bob Tisdale

Editor

November 28, 2013 9:06 am

Was Suckling et al mentioned in AR5?

Pamela Gray

November 28, 2013 9:07 am

Ah Ha! Same is true for ENSO forecasts. The mean of the statistical models has been consistently outperforming the dynamical models so much so that the “consensus” prediction has been nearly identical to the statistical forecasts over the past year.

Billy Liar

November 28, 2013 9:08 am

She’s not a climatologist. She’s not qualified to say anything about climate models!
http://cats.lse.ac.uk/homepages/ema/cv.php
She may expect ‘The Wrath of Mann’.
/sarc and with apologies to Star Trek™.

Hockey Schtick

November 28, 2013 9:09 am

A scientific ‘no change in temperature’ model outperforms IPCC climate models by factor of 7
http://hockeyschtick.blogspot.com/2013/10/a-scientific-no-change-in-temperature.html
New paper finds simple laptop computer program reproduces the flawed climate projections of supercomputer climate models
http://hockeyschtick.blogspot.com/2013/11/new-paper-finds-simple-laptop-computer.html
New paper shows the ‘simple basic physics’ of greenhouse theory exaggerate global warming by a factor of 8 times
http://hockeyschtick.blogspot.com/2013/11/new-paper-shows-simple-basic-physics-of.html

Billy Liar

November 28, 2013 9:15 am

Here is a copy of the creed for the project she is currently involved in:
The Truth About Global Warming
The following statement from the UK Science Community, dated December 10th 2009 was signed by over 1700 UK scientists.
We, members of the UK science community, have the utmost confidence in the observational evidence for global warming and the scientific basis for concluding that it is due primarily to human activities. The evidence and the science are deep and extensive. They come from decades of painstaking and meticulous research, by many thousands of scientists across the world who adhere to the highest levels of professional integrity. That research has been subject to peer review and publication, providing traceability of the evidence and support for the scientific method. The science of climate change draws on fundamental research from an increasing number of disciplines, many of which are represented here. As professional scientists, from students to senior professors, we uphold the findings of the IPCC Fourth Assessment Report, which concludes that “Warming of the climate system is unequivocal” and that “Most of the observed increase in global average temperatures since the mid-20th century is very likely due to the observed increase in anthropogenic greenhouse gas concentrations”.
http://www.equip.leeds.ac.uk/the-truth-about-global-warming/
They forgot the ‘Amen’ at the end.

Pamela Gray

November 28, 2013 9:22 am

Love the spin in the abstract and introduction of the similar ENSO paper not matching the conclusions at the end. Why does this not matter to AGWers? Because the abstract is what gets published on open access form, never the conclusion section. I imagine there must be some PR expert now employed in each Ivory Tower that helps “dress” the abstract for meet and greet sessions.
http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-11-00111.1

Pamela Gray

November 28, 2013 9:35 am

Scroll down to page 26 for a peek at the consensus forecast. Notice how close it is to the statistical forecast and how far away it is from the dynamical forecast (didn’t used to be that way). In the paper I linked to above, it was mentioned that statistical models improve with time because of increased amount of data available (we have lots of holes in the historical data that are now being filled in because of better coverage across the globe). At that time, the dynamical models were slightly better than the statistical models (BUT worse than simpler older dynamical models from the 80’s and 90’s).
Dynamical models continue to suck because of the static nature of their biased input (set the dials and forget about it). Eventually statistical models will prove to be superior in every aspect to dynamical models because of ever increasing amounts of observational data input, unless someone finds the holy grail of general circulation model dynamics that can be set to mathematical equation.
http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/lanina/enso_evolution-status-fcsts-web.pdf

Pamela Gray

November 28, 2013 9:37 am

By the way, I love the improved and highly educational International Research Institute website here linked.
http://iri.columbia.edu/our-expertise/climate/forecasts/

jorgekafkazar

November 28, 2013 10:07 am

A model of a chaotic system must be chaotic.
Chaos represents randomness.
Randomness is lack of order.
Lack of order is lack of knowledge.
Lack of knowledge is ignorance.
As the models improve, they must come closer to describing chaos.
Improved models will incorporate more and more ignorance.
An ensemble of ignorance is still ignorance.

Mike Maguire

November 28, 2013 10:42 am

Refreshing to see a paper that acknowledges serious problems with climate models. Even with massive improvements, one issue that can’t be resolved is sample size.
Hindcast simulations using better physics will lessen disparities between the models and the real world with time. A 2013 model, for instance, would be able to replicate the past 20 years much better than any 1993 model did projecting 20 years out from that time. However, judging the new model’s performance over a similar 20-year period takes 20 years.
Consider weather models predicting 2 weeks out, for instance. The GFS, being run 4 times a day, has (365 - 14) days x 4 = 1,404 separate model runs/predictions out to 2 weeks that can be compared with the real world for each entire 2-week period.
A climate model going out 20 years and updated every year would need 1,404 + 20 = 1,424 years to have the same sample size.
A climate modeler, building his first 50 year global climate model at 30 years old in the year 1990, will be 80 when its performance can be completely evaluated in 2040. Even today’s slow to respond climate modelers will have long since trashed those models by then. The long term nature of climate models/lack of timely replacements and extreme bias of those using them has caused denial of their complete failure for over a decade, a significant span, which greatly diminishes their value at this point in time.
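The sample-size comparison in this comment can be written out as a short Python sketch. It simply follows the comment's own counting (the factor of 4 runs per day, a 14-day lead, one climate forecast launched per year, a 20-year lead); the function names are hypothetical and nothing here comes from the paper.

```python
def verifiable_runs_per_year(runs_per_day, lead_days, days_in_year=365):
    """Number of forecasts launched in one year whose full lead time
    also falls inside that year, following the comment's counting."""
    return (days_in_year - lead_days) * runs_per_day


def years_for_same_sample(sample_size, launches_per_year, lead_years):
    """Years needed to launch sample_size forecasts and then wait out
    the lead time, again following the comment's counting."""
    return sample_size // launches_per_year + lead_years


weather_sample = verifiable_runs_per_year(runs_per_day=4, lead_days=14)
climate_years = years_for_same_sample(weather_sample, launches_per_year=1, lead_years=20)

print(weather_sample)
print(climate_years)
```

Changing launches_per_year shows why the paper's call for more frequent forecast launch dates matters: more launches per year shrink the wait for a usable archive.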

Alec Rawls

Editor

November 28, 2013 10:57 am

A key question is whether the empirical models are projecting solar-magnetic effects in accordance with their past statistical explanatory power. This is where the IPCC’s reliance on its particular physical models is positively anti-scientific. The IPCC acknowledges (barely) that correlations between solar activity and climate indicate a more powerful influence on climate than can be accounted for by the very slight variation in TSI, but TSI variation is the only influence they include in their models. They are not satisfied with the available theories of how a more powerful solar driver of climate could be operating and so they ignore the evidence that such a driver is at work, which is an exact inversion of the very definition of science: that evidence trumps theory, not vice versa.
Now that the sun has gone quiet, scientific reasoning predicts that the unscientific physical-model predictions of temperature should be too high. It would be interesting to see whether this is indeed where the better performance of the referenced empirical models is coming from: are they incorporating a stronger solar effect than the physical models, or are they also weighing solar effects below their historical explanatory power but are deviating from the physical models elsewhere?
Is anyone here familiar with the empirical models that are being used, and how much weight they give to solar activity?

Theo Goodwin

November 28, 2013 11:04 am

“It also calls into question the extent to which current simulation models successfully capture the physics required for realistic simulation of the Earth system and can thereby be expected to provide robust, reliable predictions (and, of course, to outperform empirical models) on longer time scales.”
How can a model capture the physics? I know how a model can reproduce a curve on a graph, but that has nothing to do with physics or physical theory. My point is that the concepts at the foundation of IPCC-speak are just hopelessly ambiguous. Someone serious who has the time should come up with a description of what can be done with models regarding the physics of climate.

Genghis

November 28, 2013 11:11 am

The main problem with all of the ‘physics’ based programs is that they are all constrained by a radiative imbalance of 1.5 W/M^2. Over time the accumulated energy has to create higher temperatures. That is why they all have logarithmic temperature increases if the projections go out far enough. I don’t believe the statistics-based programs have the 1.5 W/M^2 problem and that is why they are more accurate.
I personally think that we are close to creating a ‘physics’ based model based on first principles that has a shot at being able to make accurate predictions and it would be a low resolution fairly simple program. We just have to clear out some bad assumptions and data and correctly identify the relevant drivers (primarily water).

phlogiston

November 28, 2013 11:17 am

‘physics required for realistic simulation of the Earth system’
Physics and biology are required for realistic simulation of the Earth system.

Tagerbaek

November 28, 2013 11:33 am

Hahaha, great satire, perfect dadaism, did this come from the postmodernism generator website?

Mike Maguire

November 28, 2013 11:36 am

With regard to weather models, you may find these links interesting.
http://www.emc.ncep.noaa.gov/GFS/perf.php
A good summary of how far weather models have come is found by going to the 2009 review of GFS forecast skill.
“Skill of 5-day forecasts has doubled in 20 years in the Southern Hemisphere and 25 years in the Northern Hemisphere”
Another link:
http://www.hpc.ncep.noaa.gov/html/model2.shtml#diagnostics
Meteorologists analyze and evaluate all aspects of these models constantly/daily, use experimental models and update the physics of models from time to time(every few years?), increasing skill greatly the last 2 decades.
With climate models, on the other hand, reality and output of expected global temperature has been going the other/opposite way the last decade.
http://www.drroyspencer.com/2013/06/still-epic-fail-73-climate-models-vs-measurements-running-5-year-means/

u.k.(us)

November 28, 2013 11:37 am

Fool me once shame on you.
Fool me twice shame on me.
Quite the take-down of the current science.

ecoGuy

November 28, 2013 11:37 am

Modeling so far out will never work successfully. The simple reason is that you do not have sufficient accuracy and coverage of the system being modeled to be able to stop long-term divergence. Combine that with random actions (or non-observable inputs, i.e. what is happening underground) and you don’t have a bat’s chance in hell of being accurate over the longer term. Then add in the impacts weather events have on the system (cyclones leading to land coverage changes, pollution events, etc.) and again your model is well off track due to an ‘error’ you cannot predict.
The modellers seem to have forgotten that only the error rate accumulates over time, accuracy never does. Plus you can only reliably model a system if you can properly ‘box’ it (account for all inputs and states) – at this rate one would have to box the whole solar system and find a way of modelling hidden currently unobservable states – good luck!

Genghis

November 28, 2013 11:43 am

phlogiston says:
November 28, 2013 at 11:17 am
‘physics required for realistic simulation of the Earth system’
Physics and biology are required for realistic simulation of the Earth system.
……………………………………..
You are being way too optimistic. The first step should be to correctly model the energy flux generally through the system starting with Ocean currents, winds and clouds. Have to nail down the main drivers first. Biological effect is going to be relatively small on a global scale (huge locally though). Got to start with the basics and a stable climate model first.

Genghis

November 28, 2013 11:56 am

ecoGuy says:
November 28, 2013 at 11:37 am
The modellers seem to have forgotten that only the error rate accumulates over time, accuracy never does. Plus you can only reliably model a system if you can properly ‘box’ it (account for all inputs and states) – at this rate one would have to box the whole solar system and find a way of modelling hidden currently unobservable states – good luck!
The way Models are currently written you are correct. But programs can be taught to learn and make guesses about future data. Data Compression models do it all the time, as do Chess and Go Programs. Maybe it is time for the gamers to show the newbies how it is done.

Mike Maguire

November 28, 2013 12:13 pm

phlogiston says:
November 28, 2013 at 11:17 am
“Physics and biology are required for realistic simulation of the Earth system”
Great point. The accelerated rate of photosynthesis, greening of the planet has profound effects that are a game changer.
The increase in evapotranspiration in the United States Midwest during the growing season, from tightly packed rows of corn, increases dew points by up to 5 degrees at times and creates a micro climate over the size of half a dozen states. The added moisture from these crops causes numerous weather changes that include heavier/more rains, lower LCL(lifting condensation levels) and formation of cumulus earlier in the day(cooling effect during the day).
Warmer/muggier nights and positive feedback effects on the water cycle.
This is a clear example of a micro climate created by doubling the concentration of corn plants over the last 3 decades.
Increasing CO2 has caused the planet’s biosphere and vegetation to experience explosive growth. Just the increase in evapotranspiration alone from the explosive plant growth is causing significant changes to our planet’s climate from biology.
Interesting article:
https://www2.ucar.edu/atmosnews/opinion/4997/corn-and-climate-sweaty-topic
“Computer models also take vegetation into account. As used by the National Weather Service, the Weather Research and Forecasting model—which divides the United States land area into rectangles roughly 7.5 miles (12 kilometers) on each side—incorporates daily satellite data on the greenness of the landscape within each rectangle (though not on specific plant types). The model then assesses how much water will enter the atmosphere via the vegetation in each grid box. Forecasters can adjust the resulting model guidance based on their knowledge of local planting patterns and crop behavior.”

Latimer Alder

November 28, 2013 12:13 pm

Translation: Climate models are unfit for any purpose beyond acting as a job creation scheme for geeks.
But on a lighter note – can anybody give a quick summary of something that climate models *are* useful for? There seems to be a huge dearth of actual examples where they are any good at anything at all.
