One of the biggest issues raised by climate science skeptics, if not the biggest, is the over-reliance on computer model projections to suggest future outcomes. In this paper, climate models were hindcast-tested against actual surface observations and found to be seriously lacking. Just have a look at Figure 12 (mean temperature vs. models for the USA) from the paper, shown below:

The graph above shows observed temperature in the blue lines and model runs in the other colors. Not only do the curve shapes fail to match, the temperature offsets are significant as well. The study also looked at precipitation, which correlated even worse. The bottom line: if the models do a poor job of hindcasting, why would they do any better at forecasting? This passage from the conclusion sums it up pretty well:
…we think that the most important question is not whether GCMs can produce credible estimates of future climate, but whether climate is at all predictable in deterministic terms.
The entire paper, from the Hydrological Sciences Journal, is available online here as HTML and as a PDF (~1.3 MB); selected sections are given below:
A comparison of local and aggregated climate model outputs with observed data
Anagnostopoulos, G. G., Koutsoyiannis, D., Christofides, A., Efstratiadis, A. and Mamassis, N. ‘A comparison of local and aggregated climate model outputs with observed data’, Hydrological Sciences Journal, 55(7), 1094-1110
Abstract
We compare the output of various climate models to temperature and precipitation observations at 55 points around the globe. We also spatially aggregate model output and observations over the contiguous USA using data from 70 stations, and we perform comparison at several temporal scales, including a climatic (30-year) scale. Besides confirming the findings of a previous assessment study that model projections at point scale are poor, results show that the spatially integrated projections are also poor.
Citation Anagnostopoulos, G. G., Koutsoyiannis, D., Christofides, A., Efstratiadis, A. & Mamassis, N. (2010) A comparison of local and aggregated climate model outputs with observed data. Hydrol. Sci. J. 55(7), 1094-1110.
According to the Intergovernmental Panel on Climate Change (IPCC), global circulation models (GCM) are able to “reproduce features of the past climates and climate changes” (Randall et al., 2007, p. 601). Here we test whether this is indeed the case. We examine how well several model outputs fit measured temperature and rainfall in many stations around the globe. We also integrate measurements and model outputs over a large part of a continent, the contiguous USA (the USA excluding islands and Alaska), and examine the extent to which models can reproduce the past climate there. We will be referring to this as “comparison at a large scale”.
This paper is a continuation and expansion of Koutsoyiannis et al. (2008). The differences are that (a) Koutsoyiannis et al. (2008) had tested only eight points, whereas here we test 55 points for each variable; (b) we examine more variables in addition to mean temperature and precipitation; and (c) we compare at a large scale in addition to point scale. The comparison methodology is presented in the next section.
While the study of Koutsoyiannis et al. (2008) was not challenged by any formal discussion papers, or any other peer-reviewed papers, criticism appeared in science blogs (e.g. Schmidt, 2008). Similar criticism was received from two reviewers of the first draft of this paper, hereinafter referred to as critics. In both cases, it was only our methodology that was challenged and not our results. Therefore, after presenting the methodology below, we include a section “Justification of the methodology”, in which we discuss all the critical comments, and explain why we disagree and why we think that our methodology is appropriate. Following that, we present the results and offer some concluding remarks.
Here are the models they tested:
Comparison at a large scale
We collected long time series of temperature and precipitation for 70 stations in the USA (five were also used in the comparison at the point basis). Again the data were downloaded from the web site of the Royal Netherlands Meteorological Institute (http://climexp.knmi.nl). The stations were selected so that they are geographically distributed throughout the contiguous USA. We selected this region because of the good coverage of data series satisfying the criteria discussed above. The stations selected are shown in Fig. 2 and are listed by Anagnostopoulos (2009, pp. 12-13). 
Fig. 2. Stations selected for areal integration and their contribution areas (Thiessen polygons).
In order to produce an areal time series we used the method of Thiessen polygons (also known as Voronoi cells), which assigns weights to each point measurement that are proportional to the area of influence; the weights are the “Thiessen coefficients”. The Thiessen polygons for the selected stations of the USA are shown in Fig. 2.
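For readers who want to try this at home, here is a rough sketch (not the authors' code) of how Thiessen coefficients can be approximated numerically: assign each cell of a fine grid over the region to its nearest station and count cells per station. The station coordinates, the rectangular bounding box standing in for the real US boundary, and the grid resolution are all made up for illustration.

```python
import numpy as np

def thiessen_weights(station_lons, station_lats, lon_range, lat_range, n=500):
    """Approximate Thiessen (Voronoi) coefficients by assigning each cell of a
    fine lon/lat grid to its nearest station and counting cells per station.
    A rectangular bounding box stands in for the real region boundary."""
    lons = np.linspace(*lon_range, n)
    lats = np.linspace(*lat_range, n)
    glon, glat = np.meshgrid(lons, lats)
    # Squared distance from every grid cell to every station (crude planar metric)
    d2 = (glon[..., None] - np.asarray(station_lons)) ** 2 + \
         (glat[..., None] - np.asarray(station_lats)) ** 2
    nearest = d2.argmin(axis=-1)                      # index of closest station per cell
    counts = np.bincount(nearest.ravel(), minlength=len(station_lons))
    return counts / counts.sum()                      # weights sum to 1

# Hypothetical example: three stations inside a box roughly covering the contiguous USA
w = thiessen_weights([-120.0, -95.0, -75.0], [40.0, 35.0, 42.0],
                     lon_range=(-125.0, -67.0), lat_range=(25.0, 49.0))
print(w)  # Thiessen coefficients, proportional to each station's area of influence
```

A proper implementation would use projected or great-circle distances and clip the polygons to the actual boundary of the contiguous USA; the counting trick above only illustrates the weighting idea.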
The annual average temperature of the contiguous USA was initially computed as the weighted average of the mean annual temperature at each station, using the station’s Thiessen coefficient as weight. The weighted average elevation of the stations (computed by multiplying the elevation of each station with the Thiessen coefficient) is Hm = 668.7 m and the average elevation of the contiguous USA (computed as the weighted average of the elevation of each state, using the area of each state as weight) is H = 746.8 m. By plotting the average temperature of each station against elevation and fitting a straight line, we determined a temperature gradient θ = -0.0038°C/m, which implies a correction of the annual average areal temperature θ(H – Hm) = -0.3°C.
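The areal averaging and elevation correction described above can be written down in a few lines. This is a minimal sketch with made-up station values; only θ = -0.0038°C/m, Hm = 668.7 m and H = 746.8 m are taken from the paper.

```python
import numpy as np

# Hypothetical station values; only theta, H_m and H come from the paper.
thiessen_w   = np.array([0.25, 0.45, 0.30])     # Thiessen coefficients (sum to 1)
station_temp = np.array([11.2, 13.8, 9.5])      # mean annual temperature per station, degC

theta = -0.0038          # temperature gradient fitted against elevation, degC per m
H_m   = 668.7            # Thiessen-weighted mean elevation of the stations, m
H     = 746.8            # area-weighted mean elevation of the contiguous USA, m

areal_temp_raw       = np.sum(thiessen_w * station_temp)   # weighted areal average
elevation_correction = theta * (H - H_m)                   # = -0.0038 * 78.1 ~ -0.3 degC
areal_temp           = areal_temp_raw + elevation_correction

print(round(elevation_correction, 2), round(areal_temp, 2))
```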
The annual average precipitation of the contiguous USA was calculated simply as the weighted sum of the total annual precipitation at each station, using the station’s Thiessen coefficient as weight, without any other correction, since no significant correlation could be determined between elevation and precipitation for the specific time series examined.
We verified the resulting areal time series using data from other organizations. Two organizations provide areal data for the USA: the National Oceanic and Atmospheric Administration (NOAA) and the National Aeronautics and Space Administration (NASA). Both organizations have modified the original data by making several adjustments and using homogenization methods. The time series of the two organizations have noticeable differences, probably because they used different processing methods. The reason for calculating our own areal time series is that we wanted to avoid any comparisons with modified data. As shown in Fig. 3, the temperature time series we calculated with the method described above are almost identical to the time series of NOAA, whereas in precipitation there is an almost constant difference of 40 mm per year. 
Fig. 3. Comparison between areal (over the USA) time series of NOAA (downloaded from http://www.ncdc.noaa.gov/oa/climate/research/cag3/cag3.html) and areal time series derived through the Thiessen method; for (a) mean annual temperature (adjusted for elevation), and (b) annual precipitation.
Determining the areal time series from the climate model outputs is straightforward: we simply computed a weighted average of the time series of the grid points situated within the geographical boundaries of the contiguous USA. The influence area of each grid point is a rectangle whose “vertical” (perpendicular to the equator) side is (ϕ2 – ϕ1)/2 and its “horizontal” side is proportional to cosϕ, where ϕ is the latitude of each grid point, and ϕ2 and ϕ1 are the latitudes of the adjacent “horizontal” grid lines. The weights used were thus cosϕ(ϕ2 – ϕ1); where grid latitudes are evenly spaced, the weights are simply cosϕ.
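Assuming evenly spaced grid latitudes (so that the weight cosϕ(ϕ2 – ϕ1) reduces to cosϕ), the grid-point averaging might look something like the following sketch; the 3×4 block of model values is hypothetical, and this is not the authors' code.

```python
import numpy as np

def areal_average(field, lats_deg):
    """Latitude-weighted average of a model field.

    field    : 2-D array, shape (n_lat, n_lon), values at grid points inside the region
    lats_deg : 1-D array of the grid-point latitudes in degrees

    With evenly spaced latitudes the cell weight cos(phi) * (phi2 - phi1) reduces
    to cos(phi), since the latitude spacing is a common factor."""
    w = np.cos(np.radians(lats_deg))[:, None]     # weight per latitude row
    w = np.broadcast_to(w, field.shape)
    return np.sum(w * field) / np.sum(w)

# Hypothetical 3x4 block of model grid points between 30N and 45N
lats  = np.array([30.0, 37.5, 45.0])
field = np.array([[15.1, 14.8, 15.3, 15.0],
                  [12.2, 12.5, 11.9, 12.1],
                  [ 9.4,  9.1,  9.7,  9.2]])
print(areal_average(field, lats))
```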
It is claimed that GCMs provide credible quantitative estimates of future climate change, particularly at continental scales and above. Examining the local performance of the models at 55 points, we found that local projections do not correlate well with observed measurements. Furthermore, we found that the correlation at a large spatial scale, i.e. the contiguous USA, is worse than at the local scale.
However, we think that the most important question is not whether GCMs can produce credible estimates of future climate, but whether climate is at all predictable in deterministic terms. Several publications, a typical example being Rial et al. (2004), point out the difficulties that the climate system complexity introduces when we attempt to make predictions. “Complexity” in this context usually refers to the fact that there are many parts comprising the system and many interactions among these parts. This observation is correct, but we take it a step further. We think that it is not merely a matter of high dimensionality, and that it can be misleading to assume that the uncertainty can be reduced if we analyse its “sources” as nonlinearities, feedbacks, thresholds, etc., and attempt to establish causality relationships. Koutsoyiannis (2010) created a toy model with simple, fully-known, deterministic dynamics, and with only two degrees of freedom (i.e. internal state variables or dimensions); but it exhibits extremely uncertain behaviour at all scales, including trends, fluctuations, and other features similar to those displayed by the climate. It does so with a constant external forcing, which means that there is no causality relationship between its state and the forcing. The fact that climate has many orders of magnitude more degrees of freedom certainly perplexes the situation further, but in the end it may be irrelevant; for, in the end, we do not have a predictable system hidden behind many layers of uncertainty which could be removed to some extent, but, rather, we have a system that is uncertain at its heart.
Do we have something better than GCMs when it comes to establishing policies for the future? Our answer is yes: we have stochastic approaches, and what is needed is a paradigm shift. We need to recognize the fact that the uncertainty is intrinsic, and shift our attention from reducing the uncertainty towards quantifying the uncertainty (see also Koutsoyiannis et al., 2009a). Obviously, in such a paradigm shift, stochastic descriptions of hydroclimatic processes should incorporate what is known about the driving physical mechanisms of the processes. Despite a common misconception of stochastics as black-box approaches whose blind use of data disregards the system dynamics, several celebrated examples, including statistical thermophysics and the modelling of turbulence, emphasize the opposite, i.e. the fact that stochastics is an indispensable, advanced and powerful part of physics. Other simpler examples (e.g. Koutsoyiannis, 2010) indicate how known deterministic dynamics can be fully incorporated in a stochastic framework and reconciled with the unavoidable emergence of uncertainty in predictions.
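To get a feel for the toy-model point in the conclusion above: the sketch below is not the two-degree-of-freedom model of Koutsoyiannis (2010), but the standard Hénon map, another fully known deterministic system with just two state variables. A perturbation in the eighth decimal of the initial state is enough to produce a completely different trajectory within a few dozen steps.

```python
# Not the toy model of Koutsoyiannis (2010); just the standard Henon map, a fully
# known deterministic system with two state variables, used here to illustrate how
# tiny differences in initial state diverge into completely different trajectories.
def henon(x, y, a=1.4, b=0.3):
    return 1.0 - a * x * x + y, b * x

def trajectory(x0, y0, n=50):
    xs = []
    x, y = x0, y0
    for _ in range(n):
        x, y = henon(x, y)
        xs.append(x)
    return xs

t1 = trajectory(0.1, 0.1)
t2 = trajectory(0.1 + 1e-8, 0.1)   # initial state perturbed in the 8th decimal
# After a few dozen steps the two runs bear no resemblance to each other.
print(t1[-1], t2[-1])
```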
h/t to WUWT reader Don from Paradise

There were some AGW defenders claiming that the models had to be great because they matched history. (I discounted that, assuming that even an idiot will compare their model output with real data before making wildarse projections, and then, if it doesn’t match, either figure out what’s wrong with the model or begin introducing “parameters” that will make it match history.)
These boys were either incompetent (or blinded by agenda) or REALLY lazy!
John Brookes says:
December 6, 2010 at 6:39 am
“OK. There are some systems which are complex and which we do understand.
If you add a little bit of extra sunshine each day, and if the sun gets a bit higher in the sky each day, then over time you might expect the weather to get warmer.”
Defenders of climate modelling regularly make this point as though it actually supports their case. In fact it just highlights the weaknesses. “The system” can be seen as having
1. Deterministic features which have been well understood for a few centuries and have been subjected to empirical tests countless times (earth’s orbit around the sun, its rotation, precession ..)
2. Everything else, not well understood at all (but apparently a highly complex chaotic system, of the sort that does not allow prediction) and no serious empirical support for any theory.
So when you say that the long-term behaviour is predictable, e.g. hotter in summer than in winter, you are referring to 1., which says nothing at all about 2., which is where any man-made effects are to be found.
anna v
Very good point you’ve been making about anomalies rather than actual temperatures: they can be used to obscure failures of the models (and in climate science, if anything can be used to obscure something, it gets used). If you redrew the plots at the top of the post in terms of anomalies, it would look as though there was good agreement. (This is in addition to the use of anomalies to make very small changes appear large.)
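To see the effect concretely (made-up numbers, purely an illustration): a “model” series offset from the “observations” by a couple of degrees looks obviously wrong in absolute terms, but once each series is expressed as anomalies about its own baseline, the offset simply disappears.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1950, 2001)

# Made-up illustration: a "model" run offset ~2.5 degC from the "observations"
obs   = 12.0 + 0.005 * (years - 1950) + rng.normal(0, 0.3, years.size)
model = 14.5 + 0.005 * (years - 1950) + rng.normal(0, 0.3, years.size)

print(np.mean(model - obs))            # absolute comparison: ~2.5 degC offset

# Anomalies relative to each series' own 1950-1979 baseline
obs_anom   = obs   - obs[:30].mean()
model_anom = model - model[:30].mean()
print(np.mean(model_anom - obs_anom))  # the offset has been subtracted away (~0)
```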
Darkinbad the Brightdayler says:
December 6, 2010 at 11:43 am
I think the purpose of these models (and others) is to persuade, not to predict. If you get the persuasion right then the prediction is not important.
anna v: “But I am sure they did have these studies. That is why they came up with the brilliant idea of anomalies, as I discuss above.”
Many thanks, anna v, I’ve been bothered by the use of “anomalies” right from the start: who really cares about 10ths or 100ths of degrees? Only if they are compared to 0 degrees?
The use of anomalies causes serious problems in understanding north-south (& continental-maritime) terrestrial asymmetry in temperature-precipitation relations. It is the freezing point of water that flips the correlations upside-down over a very large portion of (for the most notable example) the northern hemisphere. Anomalies are useful for some data exploration purposes, but Simpson’s Paradox is guaranteed for researchers ignoring the freezing point of water when spatiotemporally integrating patterns across disparate geographic seasons/elevations/regions.
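A made-up illustration of that Simpson’s Paradox point (not real station data): within each of two regions, temperature and precipitation correlate positively, but because the warm region is much drier overall, the pooled correlation comes out with the opposite sign.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data only: positive temperature-precipitation slope within each region,
# but the warm region is much drier overall, so the pooled correlation flips sign.
t_cold = rng.normal(-5.0, 3.0, 200)
p_cold = 800 + 20.0 * t_cold + rng.normal(0, 30, 200)    # positive within-group slope

t_warm = rng.normal(25.0, 3.0, 200)
p_warm = 100 + 20.0 * t_warm + rng.normal(0, 30, 200)    # positive within-group slope

print(np.corrcoef(t_cold, p_cold)[0, 1])                 # positive
print(np.corrcoef(t_warm, p_warm)[0, 1])                 # positive

t_all = np.concatenate([t_cold, t_warm])
p_all = np.concatenate([p_cold, p_warm])
print(np.corrcoef(t_all, p_all)[0, 1])                   # negative: sign flips when pooled
```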
Question: Have any of the IPCC endorsed computer projections ever come true?
Anyone?
Relevant to this discussion is the recent Isaac Newton Institute seminar that looked at Stochastic Methods in Climate Modelling (http://www.newton.ac.uk/programmes/CLP/clpw01p.html). The opening address is interesting (particularly the second half), with the abstract, .PPT and .AVI available at http://www.newton.ac.uk/programmes/CLP/seminars/082310001.html.
Palmer makes three major points for why non-linear stochastic-dynamic methods should be introduced to climate modeling: Firstly (as noted on this thread) “climate model biases are still substantial, and may well be systemically related to the use of deterministic bulk-formula closure. … Secondly, deterministically formulated climate models are incapable of predicting the uncertainty in their predictions; and yet this is a crucially important prognostic variable for societal applications. ….. Finally, the need to maintain worldwide a pool of quasi-independent deterministic models … does not make efficient use of the limited human and computer resources available worldwide for climate model development.” (from abstract)
I appreciate the comments about “anomalies”. As a Greek, I can tell you, in addition, that even the term “anomaly” is illegitimate and suggestive of poor knowledge of Nature (and of language). “Anomaly” is Greek for “abnormality”. It is not abnormal to deviate from the mean. That is why we avoid this term in our paper (except a couple of times in quotation marks).
HAS.
Thanks, I’ve been asking folks to watch that video for some time.
Darkinbad says: …So the fact that a model doesn’t fit is not, in itself a failing.
Yes, in fact it is. If the glove doesn’t fit, you can’t convict.
Juraj V. says:
December 6, 2010 at 3:30 am
Can anyone run the models backwards? I want to see how it matches
a) CET record
http://climate4you.com/CentralEnglandTemperatureSince1659.htm
b) Greenland ice core record
http://www.ncdc.noaa.gov/paleo/pubs/alley2000/alley2000.gif
########
Huh, you don’t run it backwards. You pick a point in time. You set the inputs for historical forcing. You let it run forward. You check.
1. The knowledge of historical forcings is uncertain.
2. The initial “state” of the ocean is going to be wrong (i.e. what water was going where).
So you are never going to get answers that match a single location. If you run the model enough times, or run collections of models, you will get a spread of data that encompasses the means of large areas. Sorry, that’s the best the science can do.
Yes. The use of anomalies rather than degrees is very troublesome and always a product of interpretation, refinement or alteration of actual data.
“”””” Demetris Koutsoyiannis says:
December 6, 2010 at 2:38 pm
I appreciate the comments about “anomalies”. As a Greek, I can tell you, in addition, that even the term “anomaly” is illegitimate and suggestive of poor knowledge of Nature (and of language). “Anomaly” is Greek for “abnormality”. It is not abnormal to deviate from the mean. That is why we avoid this term in our paper (except a couple of times in quotation marks). “””””
Well, Greek is, as they say, all Greek to me, but even I know that “anomaly” means something that isn’t supposed to be.
So in that sense, when Mother Gaia plots an “anomaly” function, or plots a global “anomaly” map like SSTs, she gets these boring pictures with absolutely nothing on them in the way of slopes, trends, gradients or the like; everything is flat, because Mother Gaia never gets anything that isn’t supposed to be.
But I do understand why people report them for some data, because it is a bit silly to be plotting 1/100ths of a degree C on a daily temperature graph that has a full-scale range from about -90 C to about +60 C. Trying to see parts in 15,000 on a graph is not too easy on the eyes. It’s also part of the reason, I think, that the whole question of global temperature changes is just a storm in a teacup.
It seems that “anomalies” are somewhat akin to first derivatives; and that is always a noise increasing strategy.
If one cannot predict each of the variables, such as ocean oscillations, volcanoes, solar flares, and cloud amounts, just to name a few, that affect the climate, then prediction of the climate itself is impossible.
On another note, do we know whether or not forecast verifications have been attempted using a starting point, say, 50 years ago?
anna v, that’s a very good point about anomalies. I hadn’t thought about them removing the distinction of intensity, but you’re dead-on right. I always took them as the modelers’ way of subtracting away the mistakes. Of course, that rationale is a crock, too, but at least it has a superficial plausibility.
I remember looking at the CMIP GCM control runs that show baseline global air temperature spread across 5 degrees, and wondering how anyone could credit models that have such disparate outputs. And then seeing how use of anomalies just subtracts all that variation away, as though it never existed.
Antonis, thanks for the new link. The “2010” link just took me back to the same GCM study, and I didn’t realize there was a newer version of the chaotic model paper. Congratulations to you as well on the great work you all are doing.
Several commenters have criticized the subject paper because it uses only one region, not the whole Earth.
However, it is my understanding that the models don’t treat the whole globe at one time, but rather break it down into a grid. If they can’t get each grid cell (region) right, how can the results for the whole globe be right? Unless people are suggesting that the grid errors cancel out, so the total is right.
Therefore, it seems to me that using only a region to check the model makes perfect sense. I am relatively new at trying to understand all the arguments about the validity of the global temperature-predicting models, so if I have it wrong, someone please correct me.
Anna,
I’d like to add my ‘voice’ to the chorus of appreciation. Like others here, I was also always troubled by the use of anomalies, and now I know why. Your explanation of how they have been used was one of those ‘lightbulb’ moments for me. Where are all the defenders of anomalies? Usually they home in on these arguments like a swarm of killer bees.
To Drs. Anagnostopoulos, Koutsoyiannis, Christofides, Efstratiadis, and Mamassis (thank God for cut-and-paste): thank you all for this invaluable addition to the scientific arsenal of peer-reviewed criticism of CAGW. Your paper’s findings are not exactly unexpected, but they offer the kind of solid confirmation we need for what skeptics have been arguing for a couple of decades.
wsbriggs says: (December 6, 2010 at 6:25 am) To this extent, I’d like to rename the entire AGW mess to Climate Bubble.
I’ll drink to that.
James Macdonald says:
December 6, 2010 at 5:03 pm
…”On another note, do we know whether or not forecast verifications have been attempted using a starting point, say, 50 years ago?”
I second the question, only I suggest 60 years (a full ENSO cycle) as a better bracket; but why not 120, as their proxies are so accurate?
George E. Smith says: Izzere
I felt quite humbled when I saw this and realised I had never even heard of this famous person or…
Oh.
It is good that people are starting to notice how nonsensical anomalies are.
A derivative gives the rate of change: delta(T)/delta(time) has that meaning, the rate of change. Taking the numerator only and thinking that it is legitimate to use it as an energy-correlated variable is nonsense.
I am copying comments I made on another board because I have put some numbers in.
We have been bamboozled with anomalies, and worse, with global average anomalies.
The power radiated is j = sigma*T^4, not anomaly^4, and it is power/energy that makes or breaks the temperatures.
A 15 K anomaly in the Antarctic is not the same in energy content as a 15 K anomaly in the tropics 🙂. If one were negative and the other positive the average would be 0, but the tropics would be melting from the amount of energy falling on them and leaving.
Let’s put in some numbers.
Let the Antarctic be at 265 K and give it an anomaly of 15 K for the month of November, i.e. 280 K.
That gives 348 - 280 = 68 watts/m^2 of energy “anomaly” (the power radiated at 280 K minus the power radiated at 265 K).
Let the tropics be at 304 K, and with the same anomaly now at 319 K.
That gives 587 - 484 = 103 watts/m^2.
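For anyone who wants to check the arithmetic, a quick sketch using the Stefan-Boltzmann law (sigma = 5.67e-8 W m^-2 K^-4):

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def radiated(T):
    """Black-body radiated power per unit area at absolute temperature T (kelvin)."""
    return SIGMA * T ** 4

# The same +15 K "anomaly" applied to two very different base temperatures
for base in (265.0, 304.0):
    dP = radiated(base + 15.0) - radiated(base)
    print(f"{base:.0f} K -> {base + 15:.0f} K : {dP:.0f} W/m^2")
# ~69 W/m^2 for the polar case vs ~103 W/m^2 in the tropics:
# equal temperature anomalies are not equal energy anomalies.
```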
Such large variations in radiated energy happen not only geographically but at the same spot: day and night, shadow and clear.
In addition, the temperatures used are the 2-metre-height atmospheric temperatures, whereas most of the radiation comes from the solid ground, which can differ from the air by 15 C due to convection and shadow effects, from the poles to the deserts.
If one goes into gray-body constants, the fractal nature of the surfaces, the way the programs integrate over the globe and God only knows what else, the relevance of anomalies to whether the planet is heating or cooling seems to me not at all established.
I think those who keep advocating the use of the sea surface temperatures as the world’s thermometer are right.
davidc says: (December 6, 2010 at 1:03 pm) to anna v: Very good point you’ve been making about anomalies,
Pleased to see you pick this up, David. Anna’s point struck me as a secret waiting to be told, and I hope others will pursue it. Could it be that complexity has obscured simplicity? Perhaps even by intent?
It occurs to me that if computers, even super ones, can’t pick a winner in a horse race or predict the stock market, what chance is there that they can predict the future climate, which is infinitely more complex, methinks. I am reminded of the fools who send money for the sure system to pick the winning horse or stock!
Anna V:
We are flattered. Does ten years make such a difference?
Professor Koutsoyiannis has not been making this point. On the contrary, he has been calling the distinction between a deterministic and a random part a “false dichotomy” and a “naïve and inconsistent view of randomness”. He has been insisting that randomness and determinism “coexist in the same process, but are not separable or additive components” (A random walk on water, p. 586).
Edward Bancroft:
Even if we are able to model all factors and input all correct state data, we will still be unable to predict the future. Let me repeat that: Even if we are able to model all factors and input all correct state data, we will still be unable to predict the future. See “A random walk on water” linked to above for an enlightening in-depth investigation of the issue, or the epilogue of my HK Climate site for an overview.
On the “anomaly” issue:
We don’t say that you should never use departures from the mean. We say you might use them, but with care and only when you really know what you are doing. See subsection “Comparison of actual values rather than departures from the mean” in Anagnostopoulos et al. (p. 1099) on that (our argumentation is similar to Anna V’s above).
We also say: don’t call them “anomalies”. It is extremely confusing to use “anomaly” for something that is perfectly normal.
Old engineer:
We explore this question in subsection “Scale of comparison” in Anagnostopoulos et al. (p. 1097).