Citation T Kelder et al 2022 Environ. Res. Lett. 17 044052
Large-ensemble climate model simulations can provide deeper understanding of the characteristics and causes of extreme events than historical observations, due to their larger sample size. However, adequate evaluation of simulated ‘unseen’ events that are more extreme than those seen in historical records is complicated by observational uncertainties and natural variability. Consequently, conventional evaluation and correction methods cannot determine whether simulations outside observed variability are correct for the right physical reasons. Here, we introduce a three-step procedure to assess the realism of simulated extreme events based on the model properties (step 1), statistical features (step 2), and physical credibility of the extreme events (step 3). We illustrate these steps for a 2000 year Amazon monthly flood ensemble simulated by the global climate model EC-Earth and global hydrological model PCR-GLOBWB. EC-Earth and PCR-GLOBWB are adequate for large-scale catchments like the Amazon, and have simulated ‘unseen’ monthly floods far outside observed variability. We find that the realism of these simulations cannot be statistically explained. For example, there could be legitimate discrepancies between simulations and observations resulting from infrequent temporal compounding of multiple flood peaks, rarely seen in observations. Physical credibility checks are crucial to assessing their realism and show that the unseen Amazon monthly floods were generated by an unrealistic bias correction of precipitation. We conclude that there is high sensitivity of simulations outside observed variability to the bias correction method, and that physical credibility checks are crucial to understanding what is driving the simulated extreme events. Understanding the driving mechanisms of unseen events may guide future research by uncovering key climate model deficiencies. 
An understanding of these mechanisms may also play a vital role in helping decision makers to anticipate unseen impacts by detecting plausible drivers.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Weather extremes such as floods, droughts, heatwaves and cyclones can have major societal impacts including mortality and morbidity (Gasparrini et al 2015, Raymond et al 2020), and economic damages (Felbermayr and Gröschl 2014, Klomp and Valckx 2014, Kousky 2014). Weather extremes can also increase inequality (Dell et al 2012, Hallegatte and Rozenberg 2017). In risk analyses, the full range of impacts that may arise from climate and weather extremes must be evaluated (Sutton 2019). For example, the credible maximum extreme event is important for risk estimates of potentially disruptive impacts (Wilby et al 2011), such as mortality, morbidity, and damage from floods in large river systems and from dam failures (e.g. Vano et al 2019), or for climate-related shocks to food security (Kent et al 2017). However, the brevity and sparsity of historical records are well-known constraints that confound likelihood estimation of extreme events (Alexander 2016, Wilby et al 2017). Climate model projections reduce this limitation but may not capture the full range of extreme events that can arise from climate variability when just a few ensemble members are used (Van der Wiel et al 2019b, Mankin et al 2020). Large ensemble simulations from seasonal to multi-decadal prediction systems, by contrast, offer a solution to the estimation of rare events because of their multiple realizations (Allen 2003, van den Brink et al 2005, Thompson et al 2017, Van der Wiel et al 2019b, Mankin et al 2020, Brunner and Slater 2022).
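The sample-size argument can be made concrete with a short calculation. The sketch below assumes independent years and a stationary climate, and the 1000 year return period is an illustrative choice, not a value from this study. The function name is hypothetical. It compares the chance of sampling at least one such event in a record of about a century with the chance in a 2000 year ensemble.

```python
def prob_at_least_one(return_period_years: float, n_years: int) -> float:
    """Chance that n_years contain at least one event with an annual
    exceedance probability of 1 / return_period_years.

    Assumes independent, identically distributed years (an idealisation).
    """
    p_annual = 1.0 / return_period_years
    return 1.0 - (1.0 - p_annual) ** n_years


# Illustrative 1-in-1000 year event (assumed, not taken from the study).
record = prob_at_least_one(1000, 100)     # record of about a century
ensemble = prob_at_least_one(1000, 2000)  # 2000 year large ensemble
print(f"{record:.2f} vs {ensemble:.2f}")
```

Under these assumptions the century-long record has roughly a one in ten chance of containing the event, whereas the 2000 year ensemble contains at least one with a probability of about 0.86.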
Traditionally, large ensembles have been generated by stochastic weather generators trained on the historical record (e.g. Wilks and Wilby 1999, Brunner and Gilleland 2020). However, advances in super-computing and the physical realism of climate models have facilitated the exploitation of large ensemble simulations for the emulation of events with physically plausible drivers that have not yet been observed (Coumou and Rahmstorf 2012, Stevenson et al 2015, Stott et al 2016, Kent et al 2019, Thompson et al 2019, Deser et al 2020, Kay et al 2020, Swain et al 2020, Brunner and Slater 2022). Following Thompson et al (2017), we define the use of large ensemble simulations to estimate ‘unseen’ events more severe than those seen in the historical record as the Unprecedented Simulated Extremes using Ensembles (UNSEEN) approach.
One drawback of using model simulations is that biases are likely to exist, which may occasionally produce unrealistic extreme events. Many techniques have been developed to uncover potential systematic climate model biases (Eyring et al 2016, 2019), to compare simulated extreme indices with observations (Weigel et al 2021), and to evaluate the consistency between simulated and observed distributions of extreme events (Thompson et al 2017, 2019, Kelder et al 2020, Suarez-Gutierrez et al 2021). However, none of these procedures can determine whether the models are correct for the right physical reasons.
Bias correction (or data adjustment) methods are widely used to reduce model discrepancies, especially when coupling climate model simulations with impact models (Warszawski et al 2014), but do not necessarily correct the simulations for the right physical reasons (Maraun et al 2017). For example, a mismatch between simulations and observations may be caused by observational uncertainties and natural variability, rather than by model biases (Addor and Fischer 2015, Casanueva et al 2020). Existing evaluation and correction methods are thus not designed for simulated unseen events. As a consequence, large ensemble simulations with extreme events outside the range of observed variability raise an important question: to what extent can such outliers be trusted? Are the events unseen or unrealistic?
In this paper, we demonstrate a framework to check that the conclusions about unseen events obtained from large ensemble analyses are sound. Our three steps for assessing the realism of simulated events outside the range of observed variability (figure 1) are inspired by the protocol for event attribution to climate change (Philip et al 2020). Step 1 is to review model properties and assess whether the system representation has the capability to represent relevant processes leading to extreme events. Step 2 is to evaluate the statistical features of the large ensemble of simulations (whether from global climate models or regional climate models) by evaluating the consistency of simulated distributions with observations. Bias correction is an integral part of assessing statistical features because it is common practice (e.g. Warszawski et al 2014) but may influence the simulated distribution of extreme events and impacts. We, therefore, evaluate the statistical features for both raw and bias corrected values. Step 3 is to assess the physical credibility of the model simulations. Although some studies check the physical processes leading to extreme events—such as teleconnections and land–atmosphere interactions (Van der Wiel et al 2017, Thompson et al 2019, Vautard et al 2019, Kay et al 2020)—establishing physical credibility is not straightforward (Philip et al 2020), especially for unseen events.
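The consistency evaluation in Step 2 can be illustrated with a bootstrap comparison. The sketch below is one common way to perform such a check and is an assumption here, not necessarily the procedure applied in this study. The function name, the choice of three statistics, and the synthetic Gaussian data are all hypothetical. The ensemble is resampled into series of the same length as the observed record, and each observed statistic is compared with the central 95% range of the resampled values.

```python
import random
import statistics


def consistency_check(ensemble, observed, n_boot=1000, alpha=0.05, seed=0):
    """Compare observed statistics with bootstrap ranges from the ensemble.

    Returns {statistic: (lower, observed_value, upper, is_consistent)}.
    """
    rng = random.Random(seed)
    stats = {"mean": statistics.fmean, "stdev": statistics.stdev, "max": max}
    n_obs = len(observed)
    result = {}
    for name, func in stats.items():
        # Statistic of each ensemble subsample of observed-record length.
        boot = sorted(
            func(rng.choices(ensemble, k=n_obs)) for _ in range(n_boot)
        )
        lower = boot[int(n_boot * alpha / 2)]
        upper = boot[int(n_boot * (1 - alpha / 2)) - 1]
        obs_value = func(observed)
        result[name] = (lower, obs_value, upper, lower <= obs_value <= upper)
    return result


# Synthetic data for illustration only (not Amazon observations).
data_rng = random.Random(42)
ensemble = [data_rng.gauss(100.0, 10.0) for _ in range(2000)]
# "Observations" with a deliberate +15 offset relative to the ensemble.
shifted_obs = [data_rng.gauss(115.0, 10.0) for _ in range(100)]

shifted_check = consistency_check(ensemble, shifted_obs)
for name, (lower, obs_value, upper, ok) in shifted_check.items():
    print(f"{name}: obs={obs_value:.1f} range=[{lower:.1f}, {upper:.1f}] ok={ok}")
```

With the deliberate offset, the observed mean falls outside the bootstrap range and the check flags an inconsistency. Such a test cannot say whether the mismatch comes from model bias, observational uncertainty, or natural variability, which is why the physical credibility checks of Step 3 are still needed.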
We demonstrate our framework using a case study of Amazon floods. In 2009 and 2012, floods in the Amazon led to the spread of disease and to food and water insecurity (Davidson et al 2012, Hofmeijer et al 2013, Marengo and Espinoza 2016, Bauer et al 2018). At that time, the 2009 flood was the most extreme in 107 years of records, yet three years later it became the second highest in 110 years, drastically altering likelihood estimates. Despite the Amazon stage record being one of the longest in the world, the ∼100 year series is still too short for estimating credible, worst-case events.
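The effect of a single new record on likelihood estimates can be shown with a simple empirical calculation. The sketch below assumes the Weibull plotting position, T = (n + 1) / rank, which is an illustrative choice and not necessarily the estimator behind the statement above. The function name is hypothetical.

```python
def empirical_return_period(rank: int, n_years: int) -> float:
    """Empirical return period in years from the Weibull plotting position.

    rank 1 is the largest event in a record of n_years annual maxima.
    """
    if not 1 <= rank <= n_years:
        raise ValueError("rank must lie between 1 and n_years")
    return (n_years + 1) / rank


# 2009 flood: largest event (rank 1) in a 107 year record.
before_2012 = empirical_return_period(rank=1, n_years=107)
# After 2012: the same flood is the second largest (rank 2) in 110 years.
after_2012 = empirical_return_period(rank=2, n_years=110)
print(f"{before_2012:.1f} -> {after_2012:.1f} years")
```

Under this assumption, the estimated return period of the 2009 flood roughly halves, from 108 to 55.5 years, after only three more years of data.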
To sample more flood events than those available from the historical record, we use EC-Earth large ensemble global climate model simulations coupled with the PCR-GLOBWB global hydrological (water balance) model from an earlier study (Van der Wiel et al 2019b). EC-Earth and PCR-GLOBWB are state-of-the-art global models that have been applied in numerous multi-model intercomparison studies, such as within the Coupled Model Intercomparison Project (e.g. Taylor et al 2012, Samaniego et al 2019, Wanders et al 2019), and have been validated globally (Hazeleger et al 2012, Sutanudjaja et al 2018), including for Amazon streamflow (van Schaik et al 2018). Here, we extend previous studies by evaluating whether simulated extremes that exceed the historical record are likely to be unseen events or simply unrealistic. We do this by: reviewing the ability of EC-Earth and PCR-GLOBWB to simulate extreme Amazon floods (Step 1); assessing the statistical consistency of these large ensemble simulations with observations using both raw and bias corrected simulations (Step 2); and then exploring the physical drivers behind the largest simulated floods (Step 3).