New paper: climate models short on 'physics required for realistic simulation of the Earth system'

divergence
The career path of a climate modeler – reality -vs- expectations Image: Dr Paul T. Thomas

I’m pleased to have had a chance to review this new paper just published in the Journal of Climate:

Suckling, Emma B., and Leonard A. Smith, 2013: An Evaluation of Decadal Probability Forecasts from State-of-the-Art Climate Models. J. Climate, 26, 9334–9347. doi: http://dx.doi.org/10.1175/JCLI-D-12-00485.1

The lead author, Emma Suckling, was kind enough to provide me with a copy for reading. The paper evaluates the errors of the EU-based ENSEMBLES project models by hindcasting against historical observations. I was struck by the fact that in figure 2 below there is broad disagreement among the four models shown, with one having mean forecast errors as large as 4.5 a decade out.

The conclusion rather says it all: these models just don’t yet capture the physical processes of the dynamic and complex Earth system, hence the photo I included above.

Abstract

While state-of-the-art models of Earth’s climate system have improved tremendously over the last 20 years, nontrivial structural flaws still hinder their ability to forecast the decadal dynamics of the Earth system realistically. Contrasting the skill of these models not only with each other but also with empirical models can reveal the space and time scales on which simulation models exploit their physical basis effectively and quantify their ability to add information to operational forecasts. The skill of decadal probabilistic hindcasts for annual global-mean and regional-mean temperatures from the EU Ensemble-Based Predictions of Climate Changes and Their Impacts (ENSEMBLES) project is contrasted with several empirical models. Both the ENSEMBLES models and a “dynamic climatology” empirical model show probabilistic skill above that of a static climatology for global-mean temperature. The dynamic climatology model, however, often outperforms the ENSEMBLES models. The fact that empirical models display skill similar to that of today’s state-of-the-art simulation models suggests that empirical forecasts can improve decadal forecasts for climate services, just as in weather, medium-range, and seasonal forecasting. It is suggested that the direct comparison of simulation models with empirical models becomes a regular component of large model forecast evaluations. Doing so would clarify the extent to which state-of-the-art simulation models provide information beyond that available from simpler empirical models and clarify current limitations in using simulation forecasting for decision support. Ultimately, the skill of simulation models based on physical principles is expected to surpass that of empirical models in a changing climate; their direct comparison provides information on progress toward that goal, which is not available in model–model intercomparisons.

================================================================

Introduction

State-of-the-art dynamical simulation models of Earth’s climate system are often used to make probabilistic predictions about the future climate and related phenomena with the aim of providing useful information for decision support (Anderson et al. 1999; Met Office 2011; Weigel and Bowler 2009; Alessandri et al. 2011; Hagedorn et al. 2005; Hagedorn and Smith 2009; Meehl et al. 2009; Doblas-Reyes et al. 2010, 2011; Solomon et al. 2007; Reifen and Toumi 2009). Evaluating the performance of such predictions from a model or set of models is crucial not only in terms of making scientific progress but also in determining how much information may be available to decision makers via climate services. It is desirable to establish a robust and transparent approach to forecast evaluation, for the purpose of examining the extent to which today’s best available models are adequate over the spatial and temporal scales of interest for the task at hand. A useful reality check is provided by comparing the simulation models not only with other simulation models but also with empirical models that do not include direct physical simulation.

Decadal prediction brings several challenges for the design of ensemble experiments and their evaluation (Meehl et al. 2009; van Oldenborgh et al. 2012; Doblas-Reyes et al. 2010; Fildes and Kourentzes 2011; Doblas-Reyes et al. 2011); the analysis of decadal prediction systems will form a significant focus of the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5). Decadal forecasts are of particular interest both for information on the impacts over the next 10 years and from the perspective of climate model evaluation. Hindcast experiments over an archive of historical observations allow approaches from empirical forecasting to be used for model evaluation. Such approaches can aid in the evaluation of forecasts from simulation models (Fildes and Kourentzes 2011; van Oldenborgh et al. 2012) and potentially increase the practical value of such forecasts through blending forecasts from simulation models with forecasts from empirical models that do not include direct physical simulation (Bröcker and Smith 2008).

This paper contrasts the performance of decadal probability forecasts from simulation models with that of empirical models constructed from the record of available observations. Empirical models are unlikely to yield realistic forecasts for the future once climate change moves the Earth system away from the conditions observed in the past. A simulation model, which aims to capture the relevant physical processes and feedbacks, is expected to be at least competitive with the empirical model. If this is not the case in the recent past, then it is reasonable to demand evidence that those particular simulation models are likely to be more informative than empirical models in forecasting the near future.

A set of decadal simulations from the Ensemble-Based Predictions of Climate Changes and Their Impacts (ENSEMBLES) experiment (Hewitt and Griggs 2004; Doblas-Reyes et al. 2010), a precursor to phase 5 of the Coupled Model Intercomparison Project (CMIP5) decadal simulations (Taylor et al. 2009), is considered. The ENSEMBLES probability hindcasts are contrasted with forecasts from empirical models of the static climatology, persistence, and a “dynamic climatology” model developed for evaluating other dynamical systems (Smith 1997; Binter 2012). Ensemble members are transformed into probabilistic forecasts via kernel dressing (Bröcker and Smith 2008); their quality is quantified according to several proper scoring rules (Bröcker and Smith 2006). The ENSEMBLES models do not demonstrate significantly greater skill than that of an empirical dynamic climatology model either for global-mean temperature or for the land-based Giorgi region temperatures (Giorgi 2002).
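Kernel dressing, as cited above (Bröcker and Smith 2008), turns a finite ensemble into a continuous forecast density by centring a kernel on each member and averaging. A minimal Gaussian-kernel sketch: the kernel width sigma and offset are free parameters chosen here purely for illustration, not fitted as in the paper.

```python
import numpy as np

def kernel_dressed_density(ensemble, sigma, offset=0.0):
    """Forecast density p(x): an equal-weight mixture of Gaussian
    kernels, one centred on each (offset-shifted) ensemble member."""
    members = np.asarray(ensemble, dtype=float) + offset

    def density(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        z = (x[:, None] - members[None, :]) / sigma
        # average of N(member, sigma^2) densities over all members
        return np.exp(-0.5 * z**2).sum(axis=1) / (
            len(members) * sigma * np.sqrt(2.0 * np.pi))

    return density

# Example: a 3-member ensemble of temperature anomalies (the ENSEMBLES
# hindcasts used 3 members per launch); values here are made up
p = kernel_dressed_density([0.20, 0.35, 0.50], sigma=0.15)
```

Evaluating `p` at any temperature value gives the probability density the dressed ensemble assigns there, which is what the proper scoring rules in the paper then score against the observed outcome.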

It is suggested that the direct comparison of simulation models with empirical models become a regular component of large model forecast evaluations. The methodology is easily adapted to other climate forecasting experiments and can provide a useful guide to decision makers about whether state-of-the-art forecasts from simulation models provide additional information to that available from easily constructed empirical models.

An overview of the ENSEMBLES models used for decadal probabilistic forecasting is discussed in section 2. The appropriate choice of empirical model for probabilistic decadal predictions forms the basis of section 3, while section 4 contains details of the evaluation framework and the transformation of ensembles into probabilistic forecast distributions. The performance of the ENSEMBLES decadal hindcast simulations is presented in section 5 and compared to that of the empirical models. Section 6 then provides a summary of conclusions and a discussion of their implications. The supplementary material includes graphics for models not shown in the main text, comparisons with alternative empirical models, results for regional forecasts, and the application of alternative (proper) skill scores. The basic conclusion is relatively robust: the empirical dynamic climatology (DC) model often outperforms the simulation models in terms of probability forecasting of temperature.

============================================================

FIG. 2. Mean forecast error as a function of lead time across the set of decadal hindcasts for each of the ENSEMBLES simulation models as labeled. Note that the scale on the vertical axis for the ARPEGE4/OPA model is different than for the other models, reflecting the larger bias in this model.

Conclusions

The quality of decadal probability forecasts from the ENSEMBLES simulation models has been compared with that of reference forecasts from several empirical models. In general, the stream 2 ENSEMBLES simulation models demonstrate less skill than the empirical DC model across the range of lead times from 1 to 10 years. The result holds for a variety of proper scoring rules including ignorance (Good 1952), the proper linear score (PL) (Jolliffe and Stephenson 2003), and the continuous ranked probability score (CRPS) (Bröcker and Smith 2006). A similar result holds on smaller spatial scales for the Giorgi regions (see supplementary material). These new results for probability forecasts are consistent with evaluations of root-mean-square errors of decadal simulation models with other reference point forecasts (Fildes and Kourentzes 2011; van Oldenborgh et al. 2012; Weisheimer et al. 2009). The DC probability forecasts often place up to 4 bits more information (or 2⁴ = 16 times more probability mass) on the observed outcome than the ENSEMBLES simulation models.
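The "bits of information" statement maps directly onto probability ratios: the ignorance score is the negative base-2 logarithm of the probability placed on the verifying outcome, so a 4-bit advantage means sixteen times more probability mass on what was observed. A sketch with made-up probability values (not taken from the paper):

```python
import math

def ignorance(p_outcome):
    """Ignorance score (Good 1952) in bits: -log2 of the probability
    (density) a forecast placed on the verifying observation."""
    return -math.log2(p_outcome)

# Hypothetical densities two forecasts placed on one observed outcome:
p_dc, p_sim = 0.32, 0.02      # dynamic climatology vs simulation model
gain_bits = ignorance(p_sim) - ignorance(p_dc)   # 4.0 bits
mass_ratio = p_dc / p_sim                        # 16x, i.e. 2**gain_bits
```

Because the score is logarithmic, each extra bit doubles the probability mass placed on the outcome, which is why a 4-bit gap is such a large practical difference.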

In the context of climate services, the comparable skill of simulation models and empirical models suggests that the empirical models will be of value for blending with simulation model ensembles; this is already done in ensemble forecasts for the medium range and on seasonal lead times. It also calls into question the extent to which current simulation models successfully capture the physics required for realistic simulation of the Earth system and can thereby be expected to provide robust, reliable predictions (and, of course, to outperform empirical models) on longer time scales.

The evaluation and comparison of decadal forecasts will always be hindered by the relatively small samples involved when contrasted with the case of weather forecasts; the decadal forecast–outcome archive currently considered is only half a century in duration. Advances both in modeling and in observation, as well as changes in Earth’s climate, are likely to mean the relevant forecast–outcome archive will remain small. One improvement that could be made to clarify the skill of the simulation models is to improve the experimental design of hindcasts: in particular, to increase the ensemble size used. For the ENSEMBLES models, each simulation ensemble consisted of only three members launched at 5-year intervals. Larger ensembles and more frequent forecast launch dates can ease the evaluation of skill without waiting for the forecast–outcome archive to grow larger.
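The point about ensemble size can be illustrated numerically: with only three members per launch, any estimate of mean forecast error carries large sampling noise. A toy Monte Carlo sketch, where the unit-variance Gaussian member spread is an illustrative assumption rather than a property of the ENSEMBLES runs:

```python
import numpy as np

rng = np.random.default_rng(42)

def error_estimate_spread(n_members, n_forecasts=10, trials=2000):
    """Std. dev. across trials of the archive-mean forecast error when
    each forecast is the mean of n_members noisy members around a
    zero true anomaly (members ~ N(0, 1), an illustrative assumption)."""
    members = rng.standard_normal((trials, n_forecasts, n_members))
    mean_errors = members.mean(axis=2).mean(axis=1)
    return mean_errors.std()

spread_3 = error_estimate_spread(3)    # roughly 1/sqrt(10*3)
spread_15 = error_estimate_spread(15)  # roughly 1/sqrt(10*15)
```

In this toy setup the uncertainty in the estimated error shrinks roughly as one over the square root of members times launches, which is why both larger ensembles and more frequent launches help.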

The analysis of hindcasts can never be interpreted as an out-of-sample evaluation. The mathematical structure of simulation models, as well as parameterizations and parameter values, has been developed with knowledge of the historical data. Empirical models with a simple mathematical structure suffer less from this effect. Prelaunch empirical models based on the DC structure and using only observations before the forecast launch date also outperform the ENSEMBLES simulation models. This result is robust over a range of ensemble interpretation parameters (i.e., variations in the kernel width used). Both prelaunch trend models and persistence models are less skillful than the DC models considered.

The comparison of near-term climate probability forecasts from Earth simulation models with those from dynamic climatology empirical models provides a useful benchmark as the simulation models improve in the future. The blending (Bröcker and Smith 2008) of simulation models and empirical models is likely to provide more skillful probability forecasts in climate services, for both policy and adaptation decisions. In addition, clear communication of the (limited) expectations for skillful decadal forecasts can avoid casting doubt on well-founded physical understanding of the radiative response to increasing carbon dioxide concentration in Earth’s atmosphere. Finally, these comparisons cast a sharp light on distinguishing whether current limitations in estimating the skill of a model arise from external factors like the size of the forecast–outcome archive or from the experimental design. Such insights are a valuable product of ENSEMBLES and will contribute to the experimental design of future ensemble decadal prediction systems.

 


Discover more from Watts Up With That?


46 Comments
RockyRoad
November 28, 2013 1:03 pm

We have much to be thankful for this Thanksgiving Day!
Watching “The Team” squirm is one thing I’m very thankful for.

November 28, 2013 2:43 pm

The science is now settled on a bed of quicksand.

Claude Harvey
November 28, 2013 2:53 pm

I cannot think of any scientific endeavor that has gone on for such a long period of time, at such great expense and employed so many people while producing so little practical results as climate modeling. Couch it any way you wish, but the bottom line is PATHETIC!

November 28, 2013 2:57 pm

Also Prof Michael Beenstock (Hebrew University) has a team starting to test all 35 or so GCMs. He explained yesterday, in a lecture at the IEA, that getting the programme data for each is like drawing teeth. So far tests have been done on six of them:
Results:
“Errors don’t mean revert”
Model     HEGY (-4.5)   KPSS (0.25)
ACCESS    -1.6          1.36
CCSM5     -1.67         7.48
CSIRO     -1.63         3.76
GISS-R    -1.66         0.524
GISS-H    -1.70         3.25
HADLEY    -1.58         8.48
Which, I understand, means they fail dismally. Full results will not be available (because it takes so much time getting what should be available to anyone but isn’t) for another year.

November 28, 2013 2:59 pm

Sorry the table hasn’t lined up properly: HEGY and KPSS should be over the first and second column of figures respectively.

Policycritic
November 28, 2013 3:57 pm

thallstd says:
November 28, 2013 at 9:03 am
And further…
According to Raymond W. Schmitt, a senior scientist of the Department of Physical Oceanography at Woods Hole Oceanographic Institution, it will take about 162 years to get adequate resolution in computer models of the ocean. Of course, that was in 2000, so less than 150 years to go now…
”It will take a factor of 10⁸ improvement in 2 horizontal dimensions (100 km to 1 mm, the salt dissipation scale), a factor of 10⁶ in the vertical dimension (~10 levels to 10⁷) and ~10⁵ in time (fraction of a day to fraction of a second); an overall need for an increase in computational power of ~10²⁷. With an order of magnitude increase in computer speed every 6 years, it will take 162 years to get adequate resolution in computer models of the ocean.”
http://www.whoi.edu/page.do?pid=8916&tid=282&cid=24777

Excellent reference (link) to Raymond W. Schmitt’s Testimony to the Senate Committee on Commerce, Science and Transportation on The Ocean’s Role in Climate.
Thank you.
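The arithmetic in Schmitt’s estimate is internally consistent: a 10⁸ improvement in each of two horizontal dimensions, 10⁶ vertically and 10⁵ in time multiply to 10²⁷, and at one order of magnitude of computer speed every 6 years that is 27 × 6 = 162 years. A quick check:

```python
import math

# Resolution factors quoted in the testimony:
horizontal = 1e8 ** 2   # 100 km -> 1 mm, in each of 2 horizontal dims
vertical = 1e6          # ~10 levels -> ~1e7 levels
time = 1e5              # fraction of a day -> fraction of a second
total = horizontal * vertical * time   # ~1e27 overall

# One order of magnitude of computer speed every 6 years:
years = math.log10(total) * 6          # 27 orders * 6 years each
```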

November 28, 2013 3:58 pm

Model simulations cannot be accurate because they miss five macrodrivers
in their input. Small-change inputs can be left out, but not five major
macrodrivers: http://www.knowledgeminer.eu/eoo_paper.html
No wonder, of 112 models, only 3 remain within the confidence range! In a few years,
all of them will be, plainly speaking, “wrong”. No wonder: typical professorial ineptitude.

son of mulder
November 28, 2013 4:17 pm

No model will work long term apart from the earth itself. That’s because the behaviour is chaotic. You cannot model chaos long term, not even with probabilistic models, dynamic models, or ensemble models. Get used to it. You can’t model it. Anyone who suggests they can predict Europe’s, Asia’s, America’s, or Polar future climate change is a charlatan.

November 28, 2013 4:25 pm

Well, you can encounter chaos, and once in its grip you may end up in a chaotic condition yourself where you cannot even measure yourself.
The earth’s climate is a chaotic combat zone.

November 28, 2013 4:42 pm

Genghis writes “The first step should be to correctly model the energy flux generally through the system starting with Ocean currents, winds and clouds. Have to nail down the main drivers first.”
That’s fine for weather forecasts, but for climate, where you’re modelling how things fundamentally change due to subtle influences over long durations, you need everything modelled perfectly. That will never happen. Not in my lifetime anyway.

Leo Geiger
November 28, 2013 5:56 pm

Perhaps this was lost in the selective bold high-lighting:

In addition, clear communication of the (limited) expectations for skillful decadal forecasts can avoid casting doubt on well-founded physical understanding of the radiative response to increasing carbon dioxide concentration in Earth’s atmosphere.

One can only hope. We don’t have to know everything to know enough.

Editor
November 28, 2013 6:47 pm

Would someone like to translate “ENSEMBLES models”, “dynamic climatology empirical model”, and “static climatology” into plain English for me? TIA.

Paul Vaughan
November 28, 2013 8:12 pm

Berényi Péter (November 28, 2013 at 9:02 am)
Thanks for the link — much appreciated.
Here’s something else to consider.

Steven R Vada
November 28, 2013 11:36 pm

Here’s one I want to see modeled: a sphere suspended in a vacuum is illuminated by a light, and covered with sensors.
The vacuum temp is stabilized and recorded.
A bath of reflective gas insulation is injected around the sphere. The reflective gas stops 20% of the energy from ever reaching any of the sphere’s sensors.
The climate model had better show
the sphere being cooler than when the additional 20% light was illuminating it.
Instead of the current paradigm that accompanies this dumpster sized bucket of feces called current Climateur Inversion Fraud.
The second thing I want to see modeled is the concept of that identical bath, being quite cold in relation to the sphere; indeed I want the bath to be a lot colder than the sphere and I want the sphere subsequently suspended in the cold bath,
and then
I want to see every sensor on that sphere rise, yet again;
in spite of being immersed in an ice cold thermally conductive gas bath.
You know: the way the CURRENT RESEARCH SHOWS IT GOING in MAGIC GAiS WERLD.
UNTIL THAT TIME,
Climateur Pseudo-science is not representing the earth’s system. The atmosphere is a cold thermally conductive bath, augmented by convection.
Not maybe, not Tuesday, not when the Prime Minister jiggles the yen so he can buy himself a new estate in the mountains.
That’s how it is climate kids. The earth’s atmosphere’s a thermally conductive, cold bath, that starts out reflecting 20ish percent of the sun’s total energy away from ever even being near a thermal sensor on earth.
But “Evurbody arownd heeyur nos the atmusfear warms yew up. Speshlie at night.”
Yeah: and when I close the door of my refrigerator and it gets dark in there, the cold air conductively and convectively removing heat from my two liters of soda, suddenly starts backerdistically warming my soda pop instead of cooling it the way I built the refrigeration unit to do it.
The entire conceptual framework of Backerdistical Magic Gaisism is an
i.n.v.e.r.s.i.o.n.
of
r.e.a.l.i.t.y.
and when you don’t see any ‘reality’ come out of “Backerdistical Back-&-Forthisms what caint no insturmunt mayzur”
and when you see it’s most staunch defenders denying quantized energy in matter so they can claim radiant energy enters solids that are already filled to that frequency emissions,
you know you’ve got your self a fraud so large people deny you can quantize how much energy matter holds at any given temperature.
That’s somewhere past insane and a long way into willing fraud.
Along the way one of these pinhead twits needs to just jot on a dinner napkin how he figures that adding more,
of the class gas removing that original 20ish percent of energy in, to the sphere in his
‘Magic Melter Model’
is going to block 21, then 22, then 23, 4, 5 % energy in total
and make the temp rise even more.
Now mind yas I recognize perfectly well the last question’s just an extension of the first one,
but I’m here to tell each and every one of you that if you don’t realize the atmosphere is actually a cold, thermally conductive bath,
a bunch of government employees told you immersion into which,
adding it’s additional conduction, and convection, to the radiation which would be solely available without it –
then you’re so far behind the real life scientific curve, you believed it when somebody told you, that when you turn off a light in a refrigerator (at night for earth)
the frigid, conductive, and convective gas mixture in your refrigerator,
starts ‘warming’ your soda.
=======
Now: you are listening to the brainless drivel of men who would tell you immersion into a thermally conductive cold bath
removes heat slower than no bath at all.
You’re listening to the criminopathological scammings of men who have insisted the whole world believe adding more of the gas that blocks a fifth of the sun’s energy in total,
by blocking more sunlight in,
will of course make it yet hotter, on every heat sensor on earth or so;
because of course,
when you added the reflective CO2 and Water the first time,
blocking that initial 20% total energy in,
you “made all the sensors on the globe register that it got warmer on earth”
than when more energy was arriving, at the energy sensors.
It’s a scam from the first word to the last, and if you think what I’m saying’s an inversion itself, I suggest you sit down and give the matter a big long think: and ask yourself why every time one of these Magic Gais Mavens speaks, he winds up sounding like he’s tripping on peyote.
S.R.V.

November 28, 2013 11:36 pm

There seems to be a problem with HadGEM2 in any case. See correspondence between UK Met Office and Nic Lewis.
http://niclewis.files.wordpress.com/2013/09/metoffice_response2g.pdf

Brian H
November 29, 2013 12:23 am

Ultimately, the skill of simulation models based on physical principles is expected to surpass that of empirical models in a changing climate

The menu of available “physical principles” will have to expand and be revised considerably before any such thing happens, I suspect.

Steven R Vada
November 29, 2013 12:29 am

Oops sorry “a bunch of government employees told you immersion into which,
adding it’s additional conduction, and convection, to the radiation which would be solely available without it –
should have been
“to the radiation which would be solely available without it – *made everything washed in it get hotter”*
I fumbled that & missed it.

November 29, 2013 8:33 am

This paper is too generous to their dynamic modeling colleagues. A major refinement in model performance is possible right now. The IPCC ensembles consistently overestimate forecast temperatures (I know, I know, ‘projections’) over the long term. This is the definition of a bias.
Why is this? Simply because of the agenda to cling onto a central meaningful place for anthropo CO2. Take the ensemble mean, reduce the slope T/time by 1/2 to better fit the long term real trend. Then go through the tedious trial and error process of reducing CO2’s effect and adjusting other factors (positive and negative feedbacks).
From what we’ve seen over the last several years in failures, new research and extension of the empirical record, it surely means reducing CO2’s effect even more than that required to fit the empirical record, increasing negative feedbacks and decreasing positive feedbacks. Work on the mega-fluctuations of natural variability to use as a base for hanging the rest of it on (there should be less resistance to this wise course these days). This will take some decades. Please let me hear a compelling argument AGAINST reducing CO2’s role. Even if the radiative physics were to support double what the IPCC says the climate sensitivity is, logic tells us with 100% certainty that there therefore has to be some very large negative feedbacks in the system. Clearly in the very long term, CO2’s effect must be largely neutralized by such feedbacks (cycles into ice ages and back).
Meanwhile, carry on experimentation and observation and, over time, break down the agencies producing the positive and negative feedbacks, add new findings. Had we not frozen the theory in 1988, we would not have wasted 25 yrs and would be further along with this task. As it is, even the consensus is grudgingly having to admit that we have to start over from scratch despite a generation of self congratulatory hoopla, awarding of thousands of PhDs for worthless studies, spending of trillions and presentation of prestigious medals and awards.

Pamela Gray
December 1, 2013 7:29 am

Mike Jonas comments: “Would someone like to translate “ENSEMBLES models”, “dynamic climatology empirical model”, and “static climatology” into plain English for me?”
That depends on who you talk to and what paper you read. The coinage of new labels is faster than the pace of the widening gap between projections and reality. It is a breathless race to keep up. Dynamical models are generally those that try to model the process the modeler thinks is driving a climate trend. Empirical (or statistical) models are those that take decades of actual pre-conditions and resultant climate effects, which are then used to create a suite of model runs from a number of different pre-conditions that have actually happened in the past. There is generally a final component of CO2-increased temperature WAG that is tagged onto the calculations before the runs commence.
As for ensembles, there is this from AR4:
“Ensembles of models represent a new resource for studying the range of plausible climate responses to a given forcing. Such ensembles can be generated either by collecting results from a range of models from different modelling centres (‘multi-model ensembles’ as described above), or by generating multiple model versions within a particular model structure, by varying internal model parameters within plausible ranges (‘perturbed physics ensembles’).”

Pamela Gray
December 1, 2013 7:55 am

Cozy way to study climate, isn’t it. Can you imagine this kind of work in agriculture? Model how a new grape variety will survive [hot/cold/wet/dry climate change]. Don’t bother with plots at all. Decide it’s worth the expense of marketing it based on your computerized results. Spend millions in advertising campaigns. Convince entire farming communities to plant such a crop with confidence. Wait for your ship to come in.

Pamela Gray
December 1, 2013 8:16 am

That is not to say modeling is useless in agriculture. It works pretty darn good. But the modelers had to put in some field work.
http://phys.org/news/2013-07-corn-yield-simple-specific-growth.html