Interpreting extreme climate impacts from large ensemble simulations—are they unseen or unrealistic?

Open Access Paper in Environmental Research Letters

T Kelder, N Wanders, K van der Wiel, T I Marjoribanks, L J Slater, R L Wilby and C Prudhomme

Published 29 March 2022 • © 2022 The Author(s). Published by IOP Publishing Ltd
Environmental Research Letters, Volume 17, Number 4

Citation T Kelder et al 2022 Environ. Res. Lett. 17 044052


Large-ensemble climate model simulations can provide deeper understanding of the characteristics and causes of extreme events than historical observations, due to their larger sample size. However, adequate evaluation of simulated ‘unseen’ events that are more extreme than those seen in historical records is complicated by observational uncertainties and natural variability. Consequently, conventional evaluation and correction methods cannot determine whether simulations outside observed variability are correct for the right physical reasons. Here, we introduce a three-step procedure to assess the realism of simulated extreme events based on the model properties (step 1), statistical features (step 2), and physical credibility of the extreme events (step 3). We illustrate these steps for a 2000 year Amazon monthly flood ensemble simulated by the global climate model EC-Earth and global hydrological model PCR-GLOBWB. EC-Earth and PCR-GLOBWB are adequate for large-scale catchments like the Amazon, and have simulated ‘unseen’ monthly floods far outside observed variability. We find that the realism of these simulations cannot be statistically explained. For example, there could be legitimate discrepancies between simulations and observations resulting from infrequent temporal compounding of multiple flood peaks, rarely seen in observations. Physical credibility checks are crucial to assessing their realism and show that the unseen Amazon monthly floods were generated by an unrealistic bias correction of precipitation. We conclude that there is high sensitivity of simulations outside observed variability to the bias correction method, and that physical credibility checks are crucial to understanding what is driving the simulated extreme events. Understanding the driving mechanisms of unseen events may guide future research by uncovering key climate model deficiencies. 
They may also play a vital role in helping decision makers to anticipate unseen impacts by detecting plausible drivers.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.


Weather extremes such as floods, droughts, heatwaves and cyclones can have major societal impacts including mortality and morbidity (Gasparrini et al 2015, Raymond et al 2020), and economic damages (Felbermayr and Gröschl 2014, Klomp and Valckx 2014, Kousky 2014). Weather extremes can also increase inequality (Dell et al 2012, Hallegatte and Rozenberg 2017). In risk analyses, the full range of impacts that may arise from climate and weather extremes must be evaluated (Sutton 2019). For example, the credible maximum extreme event is important for risk estimates of potentially disruptive impacts (Wilby et al 2011), such as mortality, morbidity, and damage from floods in large river systems and from dam failures (e.g. Vano et al 2019), or for climate-related shocks to food security (Kent et al 2017). However, brevity and sparsity of historical records are well known constraints that confound likelihood estimation of extreme events (Alexander 2016, Wilby et al 2017). Climate model projections reduce this limitation but may not capture the full range of extreme events that can arise from climate variability when just a few ensemble members are used (Van der Wiel et al 2019b, Mankin et al 2020). However, large ensemble simulations from seasonal to multi-decadal prediction systems offer a solution to the estimation of rare events due to their multiple realizations (Allen 2003, van den Brink et al 2005, Thompson et al 2017, Van der Wiel et al 2019b, Mankin et al 2020, Brunner and Slater 2022).

Traditionally, large ensembles have been generated by stochastic weather generators trained on the historical record (e.g. Wilks and Wilby 1999, Brunner and Gilleland 2020). However, advances in super-computing and the physical realism of climate models have facilitated the exploitation of large ensemble simulations for the emulation of events with physically plausible drivers that have not yet been observed (Coumou and Rahmstorf 2012, Stevenson et al 2015, Stott et al 2016, Kent et al 2019, Thompson et al 2019, Deser et al 2020, Kay et al 2020, Swain et al 2020, Brunner and Slater 2022). Following Thompson et al (2017), we define the use of large ensemble simulations to estimate ‘unseen’ events more severe than those seen in the historical record as the Unprecedented Simulated Extremes using Ensembles (UNSEEN) approach.

One drawback of using model simulations is that biases are likely to exist, which may occasionally produce unrealistic extreme events. Many techniques have been developed to uncover potential systematic climate model biases (Eyring et al 2016, 2019), compare simulated extreme indices with observations (Weigel et al 2021), and evaluate the consistency between simulated and observed distributions of extreme events (Thompson et al 2017, 2019, Kelder et al 2020, Suarez-Gutierrez et al 2021). However, none of these procedures can determine whether the models are correct for the right physical reasons.

Bias correction (or data adjustment) methods are widely used to reduce model discrepancies, especially when coupling climate model simulations with impact models (Warszawski et al 2014), but do not necessarily correct the simulations for the right physical reasons (Maraun et al 2017). For example, a mismatch between simulations and observations may be caused by observational uncertainties and natural variability, rather than by model biases (Addor and Fischer 2015, Casanueva et al 2020). Existing evaluation and correction methods are thus not designed for simulated unseen events. As a consequence, large ensemble simulations with extreme events outside the range of observed variability raise an important question: to what extent can such outliers be trusted? Are the events unseen or unrealistic?
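To make the point about correction outside observed variability concrete, here is a minimal sketch of empirical quantile mapping, one common family of bias correction. It is not the correction used in the paper; the function name `quantile_map`, the `tail` option, and the toy calibration values are assumptions chosen purely for illustration. Inside the calibration range the mapping is constrained by data, but beyond the largest calibration value the result depends entirely on an extrapolation rule that the observations cannot test.

```python
from bisect import bisect_left


def quantile_map(x, sim_cal, obs_cal, tail="additive"):
    """Map a simulated value onto the observed distribution (illustrative only).

    sim_cal and obs_cal are equal-length calibration samples. Values beyond the
    calibration range are extrapolated with either a constant offset
    ("additive") or a constant ratio ("multiplicative"); both are assumptions.
    """
    sim, obs = sorted(sim_cal), sorted(obs_cal)
    if len(sim) != len(obs):
        raise ValueError("calibration samples must have equal length")
    if x >= sim[-1]:  # above anything seen in calibration: extrapolate
        if tail == "additive":
            return x + (obs[-1] - sim[-1])
        return x * (obs[-1] / sim[-1])
    if x <= sim[0]:  # below the calibration range: constant offset
        return x + (obs[0] - sim[0])
    i = bisect_left(sim, x)  # sim[i-1] < x <= sim[i]
    w = (x - sim[i - 1]) / (sim[i] - sim[i - 1])
    return obs[i - 1] + w * (obs[i] - obs[i - 1])


# Toy calibration data: the hypothetical model is 20% too dry at every quantile.
sim_cal = [10.0, 20.0, 30.0, 40.0]
obs_cal = [12.0, 24.0, 36.0, 48.0]

inside = quantile_map(25.0, sim_cal, obs_cal)  # interpolated from data
unseen_add = quantile_map(80.0, sim_cal, obs_cal, tail="additive")
unseen_mul = quantile_map(80.0, sim_cal, obs_cal, tail="multiplicative")
print(inside, unseen_add, unseen_mul)
```

For the simulated value of 80, which lies far outside the calibration range, the two tail rules give corrected values that differ by roughly 8 units, and nothing in the calibration data can say which is closer to the truth.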

In this paper, we demonstrate a framework to check that the conclusions about unseen events obtained from large ensemble analyses are sound. Our three steps for assessing the realism of simulated events outside the range of observed variability (figure 1) are inspired by the protocol for event attribution to climate change (Philip et al 2020). Step 1 is to review model properties and assess whether the system representation has the capability to represent relevant processes leading to extreme events. Step 2 is to evaluate the statistical features of the large ensemble of simulations (whether from global climate models or regional climate models) by evaluating the consistency of simulated distributions with observations. Bias correction is an integral part of assessing statistical features because it is common practice (e.g. Warszawski et al 2014) but may influence the simulated distribution of extreme events and impacts. We, therefore, evaluate the statistical features for both raw and bias corrected values. Step 3 is to assess the physical credibility of the model simulations. Although some studies check the physical processes leading to extreme events—such as teleconnections and land–atmosphere interactions (Van der Wiel et al 2017, Thompson et al 2019, Vautard et al 2019, Kay et al 2020)—establishing physical credibility is not straightforward (Philip et al 2020), especially for unseen events.
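The idea behind Step 2 can be sketched with a simple bootstrap check. This is a simplified illustration, not the authors' procedure: the function name `max_consistency_test`, the choice of the record maximum as the test statistic, and the synthetic numbers are all assumptions. The ensemble is resampled into many pseudo-records of the same length as the observed record, and the observed maximum is compared with the central range of the resampled maxima.

```python
import random


def max_consistency_test(ensemble, observed, n_boot=2000, alpha=0.05, seed=0):
    """Return (lo, hi, consistent) for the observed record maximum.

    lo and hi bound the central (1 - alpha) range of maxima from bootstrap
    pseudo-records drawn from the ensemble, each as long as the observed record.
    """
    rng = random.Random(seed)
    n = len(observed)
    boot_max = sorted(max(rng.choices(ensemble, k=n)) for _ in range(n_boot))
    lo_idx = int(alpha / 2 * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    lo, hi = boot_max[lo_idx], boot_max[hi_idx]
    return lo, hi, lo <= max(observed) <= hi


# Synthetic demonstration: 2000 simulated years and a 100-year "record"
# drawn from the same hypothetical distribution.
demo_rng = random.Random(42)
ensemble = [demo_rng.gauss(100.0, 10.0) for _ in range(2000)]
observed = [demo_rng.gauss(100.0, 10.0) for _ in range(100)]
lo, hi, consistent = max_consistency_test(ensemble, observed)
print(f"bootstrap range of maxima: {lo:.1f} to {hi:.1f}; consistent: {consistent}")
```

Passing such a check only says the simulated and observed distributions are statistically compatible over the observed range. It says nothing about whether ensemble values beyond that range arise for the right physical reasons, which is why Step 3 exists.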

Figure 1. A three-step procedure for evaluating the realism of large ensemble simulations lying outside observed variability. Step 1 is to assess whether the model properties are fit for purpose. Step 2 is to statistically evaluate the simulations, then apply bias correction as required. Step 3 is to evaluate the credibility of the processes within the models leading to the simulation of an unseen event. The orange colour gradient indicates the increasing confidence in the simulation of unseen events throughout the framework.

We demonstrate our framework using a case study of Amazon floods. In 2009 and 2012, floods in the Amazon led to the spread of disease, food, and water insecurity (Davidson et al 2012, Hofmeijer et al 2013, Marengo and Espinoza 2016, Bauer et al 2018). At that time, the 2009 flood was the most extreme in 107 years of records, yet three years later it became the second highest in 110 years, drastically altering likelihood estimates. Despite the Amazon stage record being one of the longest in the world, the ∼100 year series is still too short for estimating credible, worst-case events.
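A short worked example shows how strongly one additional event can shift an empirical likelihood estimate in a record of this length. The sketch below assumes the Weibull plotting position, T = (n + 1) / rank, which is one common convention and is not stated in the paper; the function name `empirical_return_period` is likewise hypothetical.

```python
def empirical_return_period(rank: int, n_years: int) -> float:
    """Weibull plotting-position return period; rank 1 is the largest event."""
    if not 1 <= rank <= n_years:
        raise ValueError("rank must lie between 1 and n_years")
    return (n_years + 1) / rank


# The 2009 flood as the largest event in a 107-year record ...
t_2009_before = empirical_return_period(rank=1, n_years=107)
# ... and the same flood as the second largest in a 110-year record.
t_2009_after = empirical_return_period(rank=2, n_years=110)
print(t_2009_before, t_2009_after)
```

Under this convention the estimated return period of the 2009 flood falls from 108 years to 55.5 years after a single larger event, which illustrates why a record of roughly 100 years cannot pin down worst-case likelihoods.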

To sample more flood events than those available from the historical record, we use EC-Earth large ensemble global climate model simulations coupled with the PCR-GLOBWB global hydrological (water balance) model from an earlier study (Van der Wiel et al 2019b). EC-Earth and PCR-GLOBWB are state-of-the-art global models that have been applied in numerous multi-model intercomparison studies, such as within the Coupled Model Intercomparison Project (e.g. Taylor et al 2012, Samaniego et al 2019, Wanders et al 2019), and have been validated globally (Hazeleger et al 2012, Sutanudjaja et al 2018), including for Amazon streamflow (van Schaik et al 2018). Here, we extend previous studies by evaluating whether simulated extremes that exceed the historical record are likely to be unseen events or simply unrealistic. We do this by: reviewing the ability of EC-Earth and PCR-GLOBWB to simulate extreme Amazon floods (Step 1); assessing the statistical consistency of these large ensemble simulations with observations using raw data or bias corrected simulations (Step 2); and then exploring the physical drivers behind the largest simulated floods (Step 3).

Read the full paper here.

April 12, 2022 10:13 pm

The answer of the question in the title sits in this quote:

“To sample more flood events than those available from the historical record, WE USE EC-Earth large ensemble GLOBAL CLIMATE MODEL simulations ”

So extreme climate impacts from large ensemble simulations are UNREALISTIC.

Allan MacRae
Reply to  Hans Erren
April 13, 2022 12:13 am

Hi Hans,
I read the word salad in the above paper and thought: Whisky Tango Foxtrot?
Best regards, Allan 🙂

Reply to  Allan MacRae
April 13, 2022 4:26 am

I heard the word salad in the above and thought: Whisky Tango, what a great night out that was

Peta of Newark
April 12, 2022 10:26 pm

Lord forgive them – they know not what they do.

Geoff Sherrington
April 12, 2022 11:03 pm

Some of the uncertainty might be resolved in the paper “Effect of Time-Resolution of Rainfall Data on Trend Estimation for Annual Maximum Depths with a Duration of 24 Hours” by Prof. Dr. Renato Morbidelli et al.

Their Abstract begins –
“The main challenge of this paper is to demonstrate that one of the most frequently conducted analyses in the climate change field could be affected by significant errors, due to the use of rainfall data characterized by coarse time-resolution. In fact, in the scientific literature, there are many studies to verify the possible impacts of climate change on extreme rainfall, and particularly on annual maximum rainfall depths, Hd, characterized by duration d equal to 24 h, due to the significant length of the corresponding series. Typically, these studies do not specify the temporal aggregation, ta, of the rainfall data on which maxima rely, although it is well known that the use of rainfall data with coarse ta can lead to significant underestimates of Hd. The effect of ta on the estimation of trends in annual maximum depths with d = 24 h, Hd=24 h, over the last 100 years is examined.”
Geoff S

April 12, 2022 11:07 pm

There is a chance there is some good in what they are doing but I’ll be damned if I can see it.

Reply to  Bob
April 13, 2022 12:53 am

It is untestable unless you had a time machine and hence BS.

Ireneusz Palmowski
April 12, 2022 11:21 pm

The troposphere in winter in high latitudes only reaches to about 6 km. Therefore, changes in the stratosphere due to a decrease in solar activity will affect winter temperatures in high and mid latitudes very quickly.

Ireneusz Palmowski
April 12, 2022 11:28 pm

Because the increase in solar wind strength is highly variable in the 25th solar cycle, no El Niño can form.

Ireneusz Palmowski
April 12, 2022 11:32 pm

Weak La Niña may make it to the end of 2022.

Ireneusz Palmowski
April 12, 2022 11:36 pm

High SOI will drive typhoons in the western Pacific.

Climate believer
April 13, 2022 12:10 am

Word salad… very soporific.

Randy Stubbings
April 13, 2022 12:21 am

“However, advances in super-computing and the physical realism of climate models have facilitated the exploitation of large ensemble simulations for the emulation of events with physically plausible drivers that have not yet been observed.”

So climate models that simulate events that have never been observed in the physical world exhibit physical realism???

“We demonstrate a framework to check that the conclusions about unseen events obtained from large ensemble analyses are sound. Our three steps … are inspired by the protocol for event attribution to climate change.”

The same attribution framework that is discussed here? The IPCC’s attribution methodology is fundamentally flawed – Watts Up With That?

Circular logic at its best. The good news is that these guys are not designing bridges or tall buildings.

Reply to  Randy Stubbings
April 13, 2022 6:50 am

No, but they are attempting to re-engineer the entire energy infrastructure of the west based only on video games that consistently run way too hot.

April 13, 2022 12:47 am

All the money going to climate doomsaying attracts many petitioners. Many of these poor souls are put to considerable toil to conceive of ways to make themselves worthy of doomsayer status and benefits. The more normal channels are now so crowded that newcomers must invent the most imaginative possible ways to blur the distinctions between reality and fantasy lands. Probably growing up with pastimes such as Dungeons & Dragons is a great help in their chosen adult fields.

April 13, 2022 1:07 am

Modelling past floods of large river systems is just not possible.
1. The change in runoff coefficient (how the imperviousness of the catchment has changed) is not known.
2. The change to river cross section, length and roughness is not known.
3. Partial area effect is not known. Absolutely critical for large catchments.
4. Proportion of snow melt is not known and is usually independent of rainfall.

The unknown impacts of any of the above are just about always greater than changes to extreme rainfall.

Chris Hanley
April 13, 2022 1:09 am

Computer models are not time machines.

Ben Vorlich
April 13, 2022 1:17 am

Aren’t we told two things
Trees prevent floods
The Amazon is suffering deforestation in an area the size of Wales every month

Therefore floods by the Amazon should be increasing in frequency and severity

Sebastian Magee
April 13, 2022 2:07 am

Does this crap pass as science, nowadays? What a lengthy piece on climate astrology.

Ireneusz Palmowski
April 13, 2022 2:45 am
Old Man Winter
Reply to  Ireneusz Palmowski
April 13, 2022 5:41 am

They need the precipitation as they were in a drought last year. Because it’s
snow, there’ll be a lot less runoff & most of that will soak into the ground.

Ireneusz Palmowski
Reply to  Old Man Winter
April 13, 2022 6:19 am

It’s not just a snowstorm in North Dakota, but large masses of Arctic air that are reaching the western US.

Ireneusz Palmowski
Reply to  Old Man Winter
April 13, 2022 6:24 am

Look at the circulation in the lower stratosphere. This is the direction of the jet currents.

Bill S
April 13, 2022 3:26 am

“ For example, a mismatch between simulations and observations may be caused by observational uncertainties and natural variability, rather than by model biases”

In other words, our model was right, reality was wrong. This stands Richard Feynman’s definition of science on its head.

“It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong. In that simple statement is the key to science.” – Richard Feynman

Mark BLR
Reply to  Bill S
April 13, 2022 4:06 am

This stands Richard Feynman’s definition of science on its head.


It also commits a variant of the “false dichotomy” logical fallacy.

It ignores the possibility that “a mismatch between simulations and observations” could be the result of a combination of “observational uncertainties”, “natural variability” and “model biases”.

Dave Fair
Reply to  Bill S
April 13, 2022 10:11 am


Mark BLR
April 13, 2022 4:01 am

Money quote 1 : “These questions are intentionally phrased to test whether the ‘null hypothesis’ (that the model is adequate) can be rejected rather than prove that it is true.”

They start by inverting the “normal” version of The Scientific Method (TSM), i.e. that
1) The null hypothesis = The new conjecture is wrong, and that
2) TSM doesn’t do “proof”, it only deals with “likelihoods” and “probabilities”. “Proof” is reserved for mathematics and alcohol.

Everything that follows is suspect.

– – – – –

Money quote 2 : “The physical realism must be checked, which, in this case, showed that the largest simulated monthly flood was an artefact of a bias correction mechanism.”

Even after rigging the system, they were still obliged to conclude, paraphrasing only slightly, that “the combination of models used was rubbish”.

Mickey Reno
April 13, 2022 4:09 am

Due to their larger sample size?

Earth to Double Dumb Asses: you have a SAMPLE SIZE of ZERO!

Reply to  Mickey Reno
April 13, 2022 5:25 am

I was about to make exactly the same comment.

Since you beat me to making that point, let me add that Environmental Research Letters (claims it) is a quarterly, peer-reviewed journal, so there are more “dumb asses” than meet the eye.

oeman 50
Reply to  Mickey Reno
April 13, 2022 6:32 am

Ha, my thoughts as well!

“…deeper understanding of the characteristics and causes of extreme events than historical observations, due to their larger sample size.”

Computer simulations are not samples at all!

David Dibbell
April 13, 2022 5:03 am

“Physical credibility checks”

Watch the Amazon in this animation from NOAA to quickly evaluate the credibility of any claim of diagnostic authority from large-grid, parameterized, step-iterated simulations.

This is from the GOES-16 imager. 2 km resolution, 120 images x 10-minute intervals. Band 16. There may be some missing images but the point is the same. There is a physical imager, sensing physical radiance values, from which these visualizations are composed at high resolution. The physical operation of the atmosphere is readily apparent.

Physical credibility checked. None found.

Andrew Wilkins
April 13, 2022 5:33 am

“Just because you haven’t seen fairies at the bottom of your garden doesn’t mean they don’t exist. Our model says they exist”
Absolute twaddle.

April 13, 2022 5:34 am

NO, NO, NO, NO, NO. You cannot use model output to adjust the data. If the measured data is uncertain or sparse, you CANNOT use models to fix the data or “add” data. This is a true abomination.

If the data is bad, you need to get more, better data, not “adjust it” to fit the model.


Old Man Winter
Reply to  Rxc
April 13, 2022 6:09 am

Spot on. They can get more & better data by using proxies from further back
in the past to see what the extremes were over a much greater time period.
The results would show real extremes that may/may not have occurred vs
model outputs that can only reflect what you think may have occurred.
With the heavy bias toward extreme events, the data will be more honest.
But that wouldn’t be as much “fun”!

Jim Gorman
Reply to  Rxc
April 13, 2022 8:39 am

Just like homogenization of temperatures. Instead of stopping a data record and starting a new one it is considered better to create NEW INFORMATION to replace past data so a LONG RECORD (hardy har har) can be retained. Or “creating” a new temperature from climates 1200 km away.

Andrew Wilkins
April 13, 2022 5:34 am

What has happened to the world of science when people are actually paid to produce this rubbish?

April 13, 2022 6:10 am

Never mind climate change….scientists say the Universe is heating up….the Great Expansion of the Universe was expected to produce cooling but now gravity including dark matter is believed to be causing warming…and that means more cosmic radiation will reach earth…so climate change is trivial….just forgetaboutit.

Kevin kilty
April 13, 2022 6:18 am

So, what are we really concerned with? The 1,000 year event? I.e. 10 times the span of observed weather events. Twenty times? A ten thousand year event? By what criteria do we establish a need for accurate yearly probability of such extreme events? Do the insurance companies need this info? Is some critical facility so expensive to replace that we need accurate assessment of tiny probabilities? If so, why don’t we worry about asteroid strikes? Does FEMA think they need it? Or is this just some fantasy need of the climate change industry?

How about we go on a campaign to establish the greatest of floods in the best preserved geology of selected river basins, and compare this data to the computer simulations, rather than use analyses and simulations to establish the credibility of simulations? I mean let’s use the Earth itself as the computer. And in places where the time span of credible geological inferences are still too short for the perceived need, let’s just use a 99% probability rather than 97.5%, or 99.9% rather than 99% or whatever.

Meanwhile get people to stop building critical facilities in known flood plains, or too close to shorelines, if the actual goal here is to preserve wealth from being destroyed.

David Dibbell
Reply to  Kevin kilty
April 13, 2022 9:21 am

“I mean let’s use the Earth itself as the computer.” Yes. As in my comment above about the implications of the images from the geostationary satellites. The physical credibility of the low-resolution computed “climate” is very poor.

Pat Frank
April 13, 2022 7:05 am

“Large-ensemble climate model simulations can provide deeper understanding of the characteristics and causes of extreme events than historical observations,…”

One can only laugh resignedly. These people are so lost they don’t know they’re lost.

Climate models can reveal nothing about extreme weather events. Thinking they do is like supposing that studying the simulated elevator explosion in The Matrix will provide a deeper understanding of real explosions, “due to their larger sample size.” In-bloody-credible.

Hacks have conquered the world. The professions are captive of fakes. Of poseurs.

Curious George(@moudryj)
Reply to  Pat Frank
April 13, 2022 7:40 am

Why the “large ensemble”? Because we don’t have one good climate model. The hope is that the models, each one with a different error, would cluster around the real thing. Unfortunately, they seem to cluster around a central error – they all run hot.

Tom Gasloli
April 13, 2022 7:33 am

That these people believe they are scientists is deeply disturbing.

Walter Sobchak
April 13, 2022 8:49 am

Video games. Who cares?

Dave Fair
Reply to  Walter Sobchak
April 13, 2022 10:19 am

I care: They are ruining my society, economy and energy systems.

Reply to  Walter Sobchak
April 13, 2022 10:29 am

I care because I don’t want a couple of pilots using a couple of climates on their iPhones to generate the weather picture at night, with lightning flashing and gusts of rain pounding the plane.

Richard M
April 13, 2022 9:30 am

These models are based on incorrect science and can hardly be expected to produce anything close to valid results. CO2 emissions are part of a complex process within the atmosphere that leads to slightly more precipitation as CO2 levels rise.

This was described by Dr. William Gray a decade ago

He explains how increases in CO2 lead to increased convection which drives water vapor higher into the atmosphere where it is colder. This reduces the concentration precisely where it has a strong greenhouse effect thus countering a small warming effect also high in the atmosphere from increased CO2.

This is actually visible in the NOAA data collected over the past 60 years.

CO2 emissions lead to higher evaporation rates, due to CO2 having a weak IR signature, which is the basis for this feedback process.

Dr. Ferenc Miskolczi described the science in his 2007, 2010 and 2014 peer reviewed papers. The clear sky opacity, a measure of the greenhouse effect, essentially stays nearly constant. We will see a small increase in precipitation as CO2 levels increase. Exactly what is needed to drive increases in biosphere productivity.

April 13, 2022 9:49 am

Scientists of the past were too incompetent to read a thermometer properly, so we need to recalculate for them.

Now the people living on the banks of the Amazon river are too incompetent to know that they died in an ‘unseen’ flood predicted by the climate models.

Andy Pattullo
April 13, 2022 9:56 am

“model simulations can provide deeper understanding”

Well there’s your problem. Like learning about physics from the Sports Illustrated Swimsuit edition.

Ireneusz Palmowski
April 13, 2022 1:40 pm

Bitter cold to persist in wake of massive blizzard in north-central US.

Mike Smith
April 13, 2022 6:19 pm

Translation: We need data that we don’t have. Therefore, we will make it up and then argue that our guesses were really pretty good.

The antithesis of science.

Dave Miller
April 14, 2022 8:31 am

Stopped at the first period.


Ireneusz Palmowski
April 14, 2022 1:00 pm

Snow totals top 40 inches as April blizzard blasts northern US
Hundreds of miles of roads were shut down, ranchers and their animals faced brutal conditions, and more tough weather lies ahead. Meanwhile, snowdrifts in some spots were estimated to be higher than 10 feet.
