New peer reviewed paper shows just how bad the climate models really are

One of the biggest issues in climate science skepticism, if not the biggest, is the over-reliance on computer model projections to suggest future outcomes. In this paper, climate models were hindcast-tested against actual surface observations and found to be seriously lacking. Just have a look at Figure 12 from the paper (mean annual temperature vs. models for the USA), shown below:

Fig. 12. Various temperature time series spatially integrated over the USA (mean annual), at annual and 30-year scales.

The graph above shows the observed temperature in blue and the model runs in other colors. Not only do the curve shapes fail to match, the temperature offsets are significant as well. The study also examined precipitation, which correlated even more poorly. The bottom line: if the models do a poor job of hindcasting, why would they do any better at forecasting? This passage from the conclusion sums it up pretty well:

…we think that the most important question is not whether GCMs can produce credible estimates of future climate, but whether climate is at all predictable in deterministic terms.

Selected sections of the paper, published in the Hydrological Sciences Journal (available online as HTML and as a ~1.3 MB PDF), are given below:

A comparison of local and aggregated climate model outputs with observed data

Anagnostopoulos, G. G. , Koutsoyiannis, D. , Christofides, A. , Efstratiadis, A. and Mamassis, N. ‘A comparison of local and aggregated climate model outputs with observed data’, Hydrological Sciences Journal, 55:7, 1094 – 1110

Abstract

We compare the output of various climate models to temperature and precipitation observations at 55 points around the globe. We also spatially aggregate model output and observations over the contiguous USA using data from 70 stations, and we perform comparison at several temporal scales, including a climatic (30-year) scale. Besides confirming the findings of a previous assessment study that model projections at point scale are poor, results show that the spatially integrated projections are also poor.

Citation Anagnostopoulos, G. G., Koutsoyiannis, D., Christofides, A., Efstratiadis, A. & Mamassis, N. (2010) A comparison of local and aggregated climate model outputs with observed data. Hydrol. Sci. J. 55(7), 1094-1110.

INTRODUCTION

According to the Intergovernmental Panel on Climate Change (IPCC), global circulation models (GCM) are able to “reproduce features of the past climates and climate changes” (Randall et al., 2007, p. 601). Here we test whether this is indeed the case. We examine how well several model outputs fit measured temperature and rainfall in many stations around the globe. We also integrate measurements and model outputs over a large part of a continent, the contiguous USA (the USA excluding islands and Alaska), and examine the extent to which models can reproduce the past climate there. We will be referring to this as “comparison at a large scale”.

This paper is a continuation and expansion of Koutsoyiannis et al. (2008). The differences are that (a) Koutsoyiannis et al. (2008) had tested only eight points, whereas here we test 55 points for each variable; (b) we examine more variables in addition to mean temperature and precipitation; and (c) we compare at a large scale in addition to point scale. The comparison methodology is presented in the next section.

While the study of Koutsoyiannis et al. (2008) was not challenged by any formal discussion papers, or any other peer-reviewed papers, criticism appeared in science blogs (e.g. Schmidt, 2008). Similar criticism has been received by two reviewers of the first draft of this paper, hereinafter referred to as critics. In both cases, it was only our methodology that was challenged and not our results. Therefore, after presenting the methodology below, we include a section “Justification of the methodology”, in which we discuss all the critical comments, and explain why we disagree and why we think that our methodology is appropriate. Following that, we present the results and offer some concluding remarks.

The paper lists the models they tested. Here is the section on the comparison at a large scale:

Comparison at a large scale

We collected long time series of temperature and precipitation for 70 stations in the USA (five were also used in the comparison at the point basis). Again the data were downloaded from the web site of the Royal Netherlands Meteorological Institute (http://climexp.knmi.nl). The stations were selected so that they are geographically distributed throughout the contiguous USA. We selected this region because of the good coverage of data series satisfying the criteria discussed above. The stations selected are shown in Fig. 2 and are listed by Anagnostopoulos (2009, pp. 12-13).

Fig. 2. Stations selected for areal integration and their contribution areas (Thiessen polygons).

In order to produce an areal time series we used the method of Thiessen polygons (also known as Voronoi cells), which assigns weights to each point measurement that are proportional to the area of influence; the weights are the “Thiessen coefficients”. The Thiessen polygons for the selected stations of the USA are shown in Fig. 2.
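For readers who want to play with the idea, here is a minimal sketch (not the authors' code) of how Thiessen coefficients and an areal average can be approximated; the station coordinates, temperatures and the rectangular stand-in for the contiguous USA below are invented placeholders, and a nearest-neighbour Monte Carlo shortcut replaces an exact Voronoi-area calculation:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# Hypothetical station coordinates (lon, lat) and mean annual temperatures;
# placeholders only, not the 70 stations used in the paper.
stations = np.array([[-120.0, 40.0], [-105.0, 35.0], [-95.0, 45.0], [-80.0, 33.0]])
station_temps = np.array([11.2, 13.5, 7.9, 18.1])

# Crude rectangle standing in for the contiguous USA (lon/lat treated as planar,
# which is good enough for an illustration).
lon_min, lon_max, lat_min, lat_max = -125.0, -67.0, 25.0, 49.0
samples = np.column_stack([rng.uniform(lon_min, lon_max, 200_000),
                           rng.uniform(lat_min, lat_max, 200_000)])

# Thiessen coefficient of a station ~ fraction of the region lying closer to it
# than to any other station (Monte Carlo approximation of the Voronoi areas).
_, nearest = cKDTree(stations).query(samples)
weights = np.bincount(nearest, minlength=len(stations)) / len(samples)

# Areal value = Thiessen-weighted average of the station values.
areal_temp = np.sum(weights * station_temps)
print(weights, round(areal_temp, 2))
```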

The annual average temperature of the contiguous USA was initially computed as the weighted average of the mean annual temperature at each station, using the station’s Thiessen coefficient as weight. The weighted average elevation of the stations (computed by multiplying the elevation of each station with the Thiessen coefficient) is Hm = 668.7 m and the average elevation of the contiguous USA (computed as the weighted average of the elevation of each state, using the area of each state as weight) is H = 746.8 m. By plotting the average temperature of each station against elevation and fitting a straight line, we determined a temperature gradient θ = -0.0038°C/m, which implies a correction of the annual average areal temperature θ(H – Hm) = -0.3°C.
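The elevation adjustment is simple enough to check by hand; below is a small sketch of the calculation, in which the per-station elevations and temperatures are invented and only Hm = 668.7 m, H = 746.8 m and θ = -0.0038°C/m are the paper's reported values:

```python
import numpy as np

# Invented per-station data, used only to show how the gradient is fitted.
elev = np.array([300.0, 550.0, 800.0, 1200.0])   # elevation, m
temp = np.array([14.0, 13.1, 12.0, 10.5])        # mean annual temperature, deg C

theta_fit, _ = np.polyfit(elev, temp, 1)          # straight-line fit: deg C per m

# Paper's reported values.
H_m = 668.7      # Thiessen-weighted mean elevation of the stations, m
H = 746.8        # area-weighted mean elevation of the contiguous USA, m
theta = -0.0038  # temperature gradient, deg C per m

correction = theta * (H - H_m)
print(f"fitted gradient ~ {theta_fit:.4f} degC/m, correction = {correction:.2f} degC")
# -0.0038 * (746.8 - 668.7) ~ -0.30 degC, matching the value quoted above.
```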

The annual average precipitation of the contiguous USA was calculated simply as the weighted sum of the total annual precipitation at each station, using the station’s Thiessen coefficient as weight, without any other correction, since no significant correlation could be determined between elevation and precipitation for the specific time series examined.

We verified the resulting areal time series using data from other organizations. Two organizations provide areal data for the USA: the National Oceanic and Atmospheric Administration (NOAA) and the National Aeronautics and Space Administration (NASA). Both organizations have modified the original data by making several adjustments and using homogenization methods. The time series of the two organizations have noticeable differences, probably because they used different processing methods. The reason for calculating our own areal time series is that we wanted to avoid any comparisons with modified data. As shown in Fig. 3, the temperature time series we calculated with the method described above are almost identical to the time series of NOAA, whereas in precipitation there is an almost constant difference of 40 mm per year.

Fig. 3. Comparison between areal (over the USA) time series of NOAA (downloaded from http://www.ncdc.noaa.gov/oa/climate/research/cag3/cag3.html) and areal time series derived through the Thiessen method; for (a) mean annual temperature (adjusted for elevation), and (b) annual precipitation.

Determining the areal time series from the climate model outputs is straightforward: we simply computed a weighted average of the time series of the grid points situated within the geographical boundaries of the contiguous USA. The influence area of each grid point is a rectangle whose “vertical” (perpendicular to the equator) side is (ϕ2 – ϕ1)/2 and its “horizontal” side is proportional to cosϕ, where ϕ is the latitude of each grid point, and ϕ2 and ϕ1 are the latitudes of the adjacent “horizontal” grid lines. The weights used were thus cosϕ(ϕ2 – ϕ1); where grid latitudes are evenly spaced, the weights are simply cosϕ.
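A short sketch of that grid-point weighting follows; only the cosϕ(ϕ2 – ϕ1) rule comes from the paper, while the grid spacing, the rectangular mask and the model field are invented placeholders:

```python
import numpy as np

# Hypothetical model grid (placeholders, not any particular GCM's grid).
dlat_deg = 1.9
lats = np.arange(25.0 + dlat_deg / 2, 49.0, dlat_deg)   # grid-point latitudes, deg
lons = np.arange(-124.0, -67.0, dlat_deg)               # grid-point longitudes, deg
lat2d, lon2d = np.meshgrid(lats, lons, indexing="ij")

# Fake model field at the grid points (e.g. mean annual temperature).
field = 25.0 - 0.4 * lat2d + 0.01 * (lon2d + 100.0)

# Crude rectangle standing in for the contiguous-USA boundary.
inside = (lat2d >= 25.0) & (lat2d <= 49.0) & (lon2d >= -125.0) & (lon2d <= -67.0)

# Weight of each grid point: cos(phi) * (phi2 - phi1). With evenly spaced
# latitudes the second factor is constant, so the weights reduce to cos(phi).
weights = np.cos(np.deg2rad(lat2d)) * np.deg2rad(dlat_deg)

areal_mean = np.sum(weights[inside] * field[inside]) / np.sum(weights[inside])
print(f"area-weighted mean over the region: {areal_mean:.2f}")
```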

CONCLUSIONS AND DISCUSSION

It is claimed that GCMs provide credible quantitative estimates of future climate change, particularly at continental scales and above. Examining the local performance of the models at 55 points, we found that local projections do not correlate well with observed measurements. Furthermore, we found that the correlation at a large spatial scale, i.e. the contiguous USA, is worse than at the local scale.

However, we think that the most important question is not whether GCMs can produce credible estimates of future climate, but whether climate is at all predictable in deterministic terms. Several publications, a typical example being Rial et al. (2004), point out the difficulties that the climate system complexity introduces when we attempt to make predictions. “Complexity” in this context usually refers to the fact that there are many parts comprising the system and many interactions among these parts. This observation is correct, but we take it a step further. We think that it is not merely a matter of high dimensionality, and that it can be misleading to assume that the uncertainty can be reduced if we analyse its “sources” as nonlinearities, feedbacks, thresholds, etc., and attempt to establish causality relationships. Koutsoyiannis (2010) created a toy model with simple, fully-known, deterministic dynamics, and with only two degrees of freedom (i.e. internal state variables or dimensions); but it exhibits extremely uncertain behaviour at all scales, including trends, fluctuations, and other features similar to those displayed by the climate. It does so with a constant external forcing, which means that there is no causality relationship between its state and the forcing. The fact that climate has many orders of magnitude more degrees of freedom certainly perplexes the situation further, but in the end it may be irrelevant; for, in the end, we do not have a predictable system hidden behind many layers of uncertainty which could be removed to some extent, but, rather, we have a system that is uncertain at its heart.
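To see the point about low-dimensional deterministic systems in action (the snippet below is not the toy model of Koutsoyiannis (2010), just a generic stand-in, the Hénon map): the dynamics are fully known, the parameters are constant, and yet two trajectories that start almost identically soon bear no resemblance to each other, so long-range point prediction is hopeless.

```python
import numpy as np

def henon(n, x0=0.1, y0=0.1, a=1.4, b=0.3):
    """Henon map: a two-variable deterministic system with constant parameters."""
    x, y = x0, y0
    out = np.empty(n)
    for i in range(n):
        x, y = 1.0 - a * x * x + b * y, x   # x_{n+1} = 1 - a*x_n^2 + b*x_{n-1}
        out[i] = x
    return out

# Two runs whose initial conditions differ by one part in 1e8.
a_traj = henon(60, x0=0.10000000)
b_traj = henon(60, x0=0.10000001)

# The tiny initial difference grows until the trajectories are unrelated,
# despite fully known dynamics and no stochastic input at all.
print(np.abs(a_traj - b_traj)[::10])
```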

Do we have something better than GCMs when it comes to establishing policies for the future? Our answer is yes: we have stochastic approaches, and what is needed is a paradigm shift. We need to recognize the fact that the uncertainty is intrinsic, and shift our attention from reducing the uncertainty towards quantifying the uncertainty (see also Koutsoyiannis et al., 2009a). Obviously, in such a paradigm shift, stochastic descriptions of hydroclimatic processes should incorporate what is known about the driving physical mechanisms of the processes. Despite a common misconception of stochastics as black-box approaches whose blind use of data disregard the system dynamics, several celebrated examples, including statistical thermophysics and the modelling of turbulence, emphasize the opposite, i.e. the fact that stochastics is an indispensable, advanced and powerful part of physics. Other simpler examples (e.g. Koutsoyiannis, 2010) indicate how known deterministic dynamics can be fully incorporated in a stochastic framework and reconciled with the unavoidable emergence of uncertainty in predictions.

h/t to WUWT reader Don from Paradise

Comments
December 6, 2010 6:33 am

Why such an offset for the present? Why don't the initial conditions of all the models match current conditions?

Dave Springer
December 6, 2010 6:34 am

It’s widely acknowledged that regional climate prediction by GCMs is not skillful. The lower 48 states of the US comprise just 2% of the earth’s surface. That sounds pretty “regional” to me. The selection of a specific 2% of the earth’s surface to integrate seems to be the very definition of cherry picking.
I fail to see how this study tells us anything the GCM boffins haven’t already admitted. Coupled ocean-atmosphere GCMs are still under development and it is hoped that these will eventually demonstrate more skill in regional prediction.

latitude
December 6, 2010 6:38 am

Paul Callahan says:
December 6, 2010 at 5:53 am
==============================
My sentiments exactly Paul.
I’ve always been curious as to what kind of person would believe that we know enough about weather to model climate in the first place….

Tim Clark
December 6, 2010 6:39 am

We think that it is not merely a matter of high dimensionality, and that it can be misleading to assume that the uncertainty can be reduced if we analyse its “sources” as nonlinearities, feedbacks, thresholds, etc., and attempt to establish causality relationships.
Money Quote. Can’t say it plainer.

John Brookes
December 6, 2010 6:39 am

OK. There are some systems which are complex and which we do understand.
If you add a little bit of extra sunshine each day, and if the sun gets a bit higher in the sky each day, then over time you might expect the weather to get warmer. And this happens every year, as we move out of winter into summer. Is the system complex? Well, yes it is. Is it predictable? Well, yes and no. January in Perth Western Australia is hotter than July – every year. Can you say that the maximum on December 8th each year will be 28.7 degrees? No, of course you can’t.
Another interesting thing is that Perth has hot dry summers, while Sydney has warm humid summers, even though they have roughly the same latitude. So in the face of similar forcings, the response is different – but it is warming in both cases.
To summarise, you can reasonably expect to make sensible predictions of trends, even in a complex system. That an attempt to model the whole world doesn’t happen to correctly predict the few percent of the globe which happens to be continental USA is not really a problem.

anna v
December 6, 2010 6:53 am

stephen richards says:
December 6, 2010 at 5:07 am

In weather terms you will hear them say that there is “some uncertainty in the forecast”. Therefore, the significant breakdown can be relatively quick (within 12 hours) or relatively slow (within 3 days). I like the idea of an analogue computer but the same problems apply. The most important player in our ‘climate’ system, as opposed to weather, is the ENSO and that cannot be predicted either in time or magnitude. Therefore, a viable climate model will almost certainly remain a mythical beast.

Have a look at the alternative neural net climate model of Tsonis et al., particularly the publication that uses not only ENSO but also the PDO etc. and predicts cooling for the next 30 years.
The problem posed by Prof. Koutsoyiannis, that climate may be deterministic in one part and stochastic in another, is real; but if one could have an analogue computer where all the coupled differential equations were simulated, the deterministic part could be simulated, the way Tsonis does with neural nets and the level of currents. The remaining stochastic part, real Gaussian randomness, could be simulated with the errors of the parameters entering the simulation. Studying the output of such models would indicate whether there are attractors, which would help in gauging the probabilities of climate paths.
The GCMs cannot yield probabilities because they do not have propagation of errors, so there is no gauge of the probability of finding the different paths. Changing parameters by hand does not simulate chaos, as climate modelers sometimes claim. The spaghetti graphs and the ensembles of models demonstrate only the chaos in the thought processes of the modelers’ ensemble.

Dave Springer
December 6, 2010 7:20 am

GCMs aren’t crap but it’s painfully obvious they lack the skill reasonably needed upon which to base major policy decisions. They are a work in progress. The two major areas in need of more work are adequately modeling the water cycle and figuring out exactly what influence solar magnetic field changes have on high-altitude cloud formation.
The earth’s albedo is represented in these models as a constant with the value for the constant selected to obtain the most consistent hindcast. The selected value varies between models by up to 7% in the range of 30-40%. A 1% difference in albedo is more than the forcing of all anthropogenic greenhouse gases combined which provides some perspective on how important it is to get albedo modeling done right.
It is exceedingly difficult to measure earth’s average albedo. Various recent attempts by different means are not in satisfactory agreement with each other but they are all in agreement on one thing – the earth’s albedo is NOT constant and in just the course of a decade has been seen to vary by one percent.
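To put rough numbers on the albedo comparison above (a back-of-envelope sketch, not a result from the paper; the figures below are approximate textbook values, and “1%” is read as an absolute albedo change of 0.01, from 0.30 to 0.31, which is the more generous reading):

```python
# Back-of-envelope comparison of an albedo change with greenhouse-gas forcing.
S0 = 1361.0             # solar constant, W/m^2 (approximate)
insolation = S0 / 4.0   # global-mean top-of-atmosphere insolation, ~340 W/m^2

d_albedo = 0.01         # absolute albedo change of 0.01 (e.g. 0.30 -> 0.31)
d_absorbed = insolation * d_albedo   # change in absorbed sunlight, W/m^2

ghg_forcing = 2.6       # rough IPCC AR4 figure for long-lived GHG forcing, W/m^2

print(f"0.01 albedo change: ~{d_absorbed:.1f} W/m^2 of absorbed sunlight")
print(f"quoted anthropogenic GHG forcing: ~{ghg_forcing:.1f} W/m^2")
# A relative 1% change (0.30 -> 0.303) would be about a third of the above,
# still comparable in magnitude to the GHG forcing.
```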
Another confounding factor is exactly where the “average global temperature” is being measured and where it is predicted. Model outputs are being compared to thermometers approximately 4 feet above the surface. Just the other morning at dawn, when the nightly low temperature was at its lowest, my thermometer at a height of 4 feet indicated 42 degrees F, yet there was frost on my roof where the roof had a clear view of the sky and no frost where the view directly overhead was obstructed. That’s called hoarfrost and it is the result of radiative cooling: the clear night sky was literally sucking heat from the roof surface wherever that surface could “see” the sky directly above, and where the view was blocked the roof temperature was the same as the air temperature. The difference was over 10F simply due to the radiative path being obstructed or not. So if you were a bird in a tree it was a not terribly cold 42F, but if you were a tomato seedling an inch above the ground you would have frozen to death. Altitude makes a big difference.

This raises the point about how evaporation and convection of water vapor swiftly, efficiently, and undetectably (to a thermometer) remove heat from the surface and transport it a thousand meters, more or less, above the surface, where upon condensation it is released as sensible heat that a thermometer can measure. Who really cares if it’s getting warmer a thousand feet above the surface? We don’t live in the cloud layer. We only care about the temperature very near the surface where we and other living things actually spend our lives.

Nuke
December 6, 2010 7:36 am

english2016 says:
December 5, 2010 at 10:44 pm
At what point does Al Gore get charged with perpetrating a fraud, and the nobel prize withdrawn?

Remember, Gore won the Nobel Peace Prize. Unlike the Nobel prizes in science or medicine, a Nobel Peace Prize does not have any accomplishments as a prerequisite. How many Nobel Peace Prizes have been given for the Middle East? Do we have peace in the Middle East?
Maybe we can start by requesting Carter, Gore and Obama voluntarily give their Nobel Peace Prizes back?

anna v
December 6, 2010 7:44 am

John Brookes says:
December 6, 2010 at 6:39 am
That an attempt to model the whole world doesn’t happen to correctly predict the few percent of the globe which happens to be continental USA is not really a problem.
The problem is not with the USA area. It is with the whole world. The USA area has enough measurements to give statistically significant estimates of the deviation. Their first publication, which had the same results, used 8 locations around the world.
In their conclusion at the time:
At the annual and the climatic (30-year) scales, GCM interpolated series are irrelevant to reality. GCMs do not reproduce natural over-year fluctuations and, generally, underestimate the variance and the Hurst coefficient of the observed series. Even worse, when the GCM time series imply a Hurst coefficient greater than 0.5, this results from a monotonic trend, whereas in historical data the high values of the Hurst coefficient are a result of large-scale over-year fluctuations (i.e. successions of upward and downward “trends”). The huge negative values of coefficients of efficiency show that model predictions are much poorer than an elementary prediction based on the time average.
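For anyone unfamiliar with the two diagnostics in that quote, here is a generic sketch (not the authors' code) of a Nash-Sutcliffe-type coefficient of efficiency and a rough aggregated-variance estimate of the Hurst coefficient, exercised on synthetic data:

```python
import numpy as np

def coefficient_of_efficiency(obs, sim):
    """Nash-Sutcliffe-type efficiency: 1 is perfect, 0 is no better than
    predicting the observed mean, large negative values are much worse."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def hurst_aggregated_variance(x, scales=(1, 2, 4, 8, 16, 32)):
    """Rough Hurst estimate: the variance of k-step averages of a stationary
    series scales like k**(2H - 2), so H is read off a log-log slope."""
    x = np.asarray(x, float)
    log_k, log_var = [], []
    for k in scales:
        n = (len(x) // k) * k
        means = x[:n].reshape(-1, k).mean(axis=1)
        if len(means) > 1:
            log_k.append(np.log(k))
            log_var.append(np.log(means.var()))
    slope = np.polyfit(log_k, log_var, 1)[0]
    return 1.0 + slope / 2.0

# Synthetic demo: a "model" that just predicts the observed mean scores 0,
# and white noise gives a Hurst coefficient near 0.5.
rng = np.random.default_rng(1)
obs = rng.normal(size=512)
print(coefficient_of_efficiency(obs, np.full_like(obs, obs.mean())))  # ~0
print(hurst_aggregated_variance(obs))                                  # ~0.5
```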

Tim Clark
December 6, 2010 7:47 am

Trenberth (2010) soberly assesses the transient deficiency related to model improvement: “Adding complexity to a modelled system when the real system is complex is no doubt essential for model development. It may, however, run the risk of turning a useful model into a research tool that is not yet skilful at making predictions.”
Another money quote. To paraphrase: if the model is as complex as the real system, we won’t be able to make our predictions.
Doh!

H.R.
December 6, 2010 7:49 am

The thing I could never figure is how one could model something that never exactly repeats. The earth’s climate has never been and will never be exactly the same from the earth’s inception to its eventual destruction. We do see some cyclical patterns in the geological record, but there’s no guarantee that a particular pattern will ever repeat once a new pattern emerges. Perhaps we might be able to get a handle on the climate at multi-millennial scales someday, but even then, who’s to say that a new pattern won’t eventually emerge and we’d have to start over nearly from scratch?

Larry
December 6, 2010 7:51 am

I always thought the definition of a chaotic system was one that could not be modelled.
It seems to me that unchecked computer models have become the de facto way of overcoming common sense. It was pretty obvious a long time ago that the financial models had a problem, and rather than requiring simple checks (house price to earnings ratios, for instance), the models were accepted at face value. In and of itself, a computer model is evidence of nothing but the idealised system it models.
It does make you wonder about the nuclear deterrents. New explosions were banned and replaced by computer simulations. Are those models valid? Good computer models are improving our world – but only when somebody is betting on it with their own money.

Vorlath
December 6, 2010 8:03 am

It’s like climate scientists forget the three basic rules of computing. These rules are great because you don’t need to know anything about the topic to recognize them when they happen.
1. If the input data is wrong, the output will be wrong. Otherwise known as garbage in, garbage out.
2. If the algorithm (the model) is wrong, the output will be wrong (even with the right input).
3. If you can’t predict the past, you can’t predict the future.
These are universal. So when someone asks if you’re an expert in the field, they have just lost the argument if one of these points applies.

James Sexton
December 6, 2010 8:31 am

Well, that’s 2 recent papers that make any GCM programmer look entirely silly. For some reason, while reading the study, I had a particular Eagles song blaring in my mind.
She(modelers) did exactly what her daddy (alarmist scientists) had planned.
She was perfect little sister until somebody missed
her and they found her in the bushes with
the boys in the band ..

Latimer Alder
December 6, 2010 8:32 am

springer
‘GCMs aren’t crap but it’s painfully obvious they lack the skill reasonably needed upon which to base major policy decisions. They are a work in progress’
H’mmm
Given that there has been thirty years development and zillions of dollars spent, wouldn’t it just be sensible to say ‘enough is enough’, file the project in the bin marked ‘too difficult’ and spend the money on something that does have some practical value to real people living today…not putative people who may or may not be living tomorrow when the climate may or may not be a tad warmer.
There is no point in throwing good money after bad.

pat
December 6, 2010 9:28 am

Fascinating. And note that the predictions of past temperatures are always higher, as if the models themselves are biased in that direction. And this after the utilization of ‘homogenized’ historical temperatures, which is in and of itself biased.

Edward Bancroft
December 6, 2010 9:34 am

“However, we think that the most important question is not whether GCMs can produce credible estimates of future climate, but whether climate is at all predictable in deterministic terms.”
This is the crux of the argument for using models as a basis for major policy decisions. A deterministic model is one where all of the relationships and feedback structures between the influencing factors are known, and the target problem is bounded.
If in fact there are only approximations to the rules in the model, or the input data has measurement errors, or state disturbing random one-off events can occur, the model predictions are not likely to match the subsequently observed data. Basing major environmental economic policy on such models would, in a sensible world, be considered too high a level of risk.
If we are not able to model all factors and input all correct state data due to the complexity of the targeted physical processes, we will only have a training device, which is of some value in determining sensitivities and general understanding. However, if it can be shown that this useful training model will not match past known target system performance, we should not be relying to any extent on the model whatsoever for serious policy making.

December 6, 2010 9:58 am

To paraphrase Dave Springer, “Never trust a wet (damp) thermometer.” Here in Nashville, I’m watching the snowflakes play with the wind. Hide and Seek, I think.
It’s roughly 27F out there.

Chris B
December 6, 2010 10:16 am

A lot more pleasant than any CAGW GCM.

December 6, 2010 10:40 am

I am afraid the basic flaw of GCMs is that they operate on a false premise about radiative forcing creating that mythical 33 K number. The rest is just computing power, creating a copy of the CO2 curve.

FrankK
December 6, 2010 10:43 am

Lew Skannen says:
It is the giant thermostat in the sky that they are currently trying to appease by sacrificing dollars in Cancun…
———————————————————————————————————–
I really like that image. Yes, has the AGW league progressed much further than native bushmen in the climate wilderness?

George E. Smith
December 6, 2010 10:50 am

Well this may be a dumb question; in fact it is almost certain to be; but forgive me, because I haven’t picked myself up off the floor yet from laughing my A*** off. So the Max Planck chaps use a 1.9 x 1.9 degree global grid in their models; so from henceforth, I am not going to worry any more about Optical Mouse digital cameras only having 32 x 32 pixel sensors; well some are only 15 x 15; well you have to give a little back if you want 10,000 picture frames per second.
So my dumb question:- For those 96 x 192 grids (and the other grids), HOW MANY of those grid points actually have a MEASURING THERMOMETER located there to gather the input data?
Well it seems like a fourth grade science question; if you want to draw an accurate map of something; that you should actually make measurements of just where places really are. Biking around Central Park will hardly give you the information you need to draw an accurate map of Manhattan Island.
Now just compare models: Max and the boys have 96 x 192 grid points, and they don’t actually have a thermometer at any of them (or do they?); but then there is Mother Gaia’s model; and she actually has a thermometer in every single atom or molecule; and somehow, Mother Gaia always gets the weather and climate correct; it always is exactly what her model predicts it will be; especially the Temperature.
See I told you it was a dumb question; maybe MG actually read the book on the Nyquist Sampling Theorem.
Izzere actually somewhere a brochure or document or whatever on any of these models, that says exactly what Physical variables, and parameters they use, and what equations they use to calculate all this stuff.
I presume that they at least accurately track the sun’s position on the earth surface, where it intersects the line joining the sun and earth centers, continuously as the earth rotates, for at least the whole thirty years of one IPCC approved climate period.
That would seem to be necessary to figure out just where solar flux falls on the surface; and whether it lands in the water, or on the land; because different things would happen depending on that question.
Well if somebody has a paper that lists ALL of the variables, and equations, and parameters, this place (WUWT) would be a good place to show that; well, are there copyright issues, or is this stuff some sort of state secret? Does Peter Humbug have his own secret formula?

Paul Vaughan
December 6, 2010 11:18 am

Untenable assumptions cannot be tolerated. The defeatist stochastic approach is dangerous to both society & civilization and will only be advocated by the lazy and those wishing to maintain blinders over the eyes of the masses. There may arise a time in the future when adding stochastic components will be the last remaining sensible thing to do, but sophistication of climate data exploration has [to date] been absolutely inadequate and thus climate science remains mired in the pit of Simpson’s Paradox. The era of meaningful statistical inference & modeling will necessarily follow an era of careful data exploration.

Darkinbad the Brightdayler
December 6, 2010 11:43 am

I’m astonished that only one model seems to be presented and then held on to.
From a mathematical and statistical point of view it makes more sense to examine a number of possible models in parallel.
But then, the purpose of a model is to hold up a mirror to the actual events observed and to discuss how accurately it fits the events it describes.
It is in the examination of the differences that we seek to refine our understanding.
So the fact that a model doesn’t fit is not, in itself, a failing.
Perhaps hanging on to one and not being willing to explore divergences is.

anna v
December 6, 2010 12:01 pm

Darkinbad the Brightdayler says:
December 6, 2010 at 11:43 am
I’m astonished that only one model seems to be presented and then held on to.
What one model? Have a look at the table and see there are six GCMs considered in the comparison.
Maybe you are on a wrong thread?