Climate models outperformed by random walks

First, a bit of a primer. Wikipedia describes a random walk as a mathematical formalisation of a trajectory that consists of taking successive random steps. For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the financial status of a gambler can all be modeled as random walks. The term random walk was first introduced by Karl Pearson in 1905.

Example of eight random walks in one dimension starting at 0. The plot shows the current position on the line (vertical axis) versus the time steps (horizontal axis). Source: Wikipedia
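For anyone who wants to reproduce a figure of this kind, here is a minimal sketch (Python is assumed purely for illustration; it is not part of the original article) that simulates eight one-dimensional random walks of ±1 steps:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_walks, n_steps = 8, 200

# Each step is +1 or -1 with equal probability; the cumulative sum is the path.
steps = rng.choice([-1, 1], size=(n_walks, n_steps))
paths = np.cumsum(steps, axis=1)

for path in paths:
    plt.plot(path)
plt.xlabel("time step")
plt.ylabel("position on the line")
plt.show()
```
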
Computer models utterly fail to predict climate changes in regions

From the Financial Post: A 2011 study in the Journal of Forecasting took the same data set and compared model predictions against a “random walk” alternative, consisting simply of using the last period’s value in each location as the forecast for the next period’s value in that location.

The test measures the sum of errors relative to the random walk. A perfect model gets a score of zero, meaning it made no errors. A model that does no better than a random walk gets a score of 1. A model receiving a score above 1 did worse than uninformed guesses. Simple statistical forecast models that have no climatology or physics in them typically got scores between 0.8 and 1, indicating slight improvements on the random walk, though in some cases their scores went as high as 1.8.
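As a rough illustration of the scoring just described (the exact error measure used in the study may differ; this is only a sketch), the score amounts to dividing a model's summed absolute errors by those of the naive last-value (random walk) forecast:

```python
import numpy as np

def relative_skill(actual, forecast):
    """Sum of a model's absolute errors divided by the errors of a naive
    'last period's value' (random walk) forecast.
    0 = perfect, 1 = no better than the random walk, >1 = worse."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    naive = actual[:-1]                         # last observed value as forecast
    model_err = np.abs(actual[1:] - forecast[1:]).sum()
    naive_err = np.abs(actual[1:] - naive).sum()
    return model_err / naive_err

# Toy, made-up numbers only: a score above 1 means worse than the random walk.
obs = [14.2, 14.5, 14.1, 14.8, 15.0]
fc  = [14.2, 14.9, 13.5, 15.6, 14.2]
print(relative_skill(obs, fc))   # comes out above 1 here
```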

The climate models, by contrast, got scores ranging from 2.4 to 3.7, indicating a total failure to provide valid forecast information at the regional level, even on long time scales. The authors commented: “This implies that the current [climate] models are ill-suited to localized decadal predictions, even though they are used as inputs for policymaking.”

More here: http://opinion.financialpost.com/2012/06/13/junk-science-week-climate-models-fail-reality-test/

h/t to WUWT reader Crispin in Waterloo

Previously, WUWT covered this issue of random walks here:

Is Global Temperature a Random Walk?

UPDATE: The paper (thanks to reader MT): Fildes, R., and N. Kourentzes, 2011: Validation and forecasting accuracy in models of climate change. International Journal of Forecasting, doi:10.1016/j.ijforecast.2011.03.008

and is available as a PDF here

160 Comments
Burch
June 14, 2012 9:40 am

> how numerical models can be “low skill” over short timescales but could possibly be accurate enough to justify massive intervention over multi-decadal timescales?
Simple really… It’s easy to confirm the (non)accuracy of a model over short timescales. By the time ‘multi-decades’ have passed, no one will remember the model as it will have been replaced a few hundred times. But for now, one can claim the current long-view model to be accurate, secure in the knowledge that no one can prove that it is not.

Jim Clarke
June 14, 2012 9:41 am

In regards to:
A fan of *MORE* discourse says:
June 14, 2012 at 6:22 am
Your analogy of fluid dynamics over a wing is not appropriate for predicting climate. To make it more compatible, you would have to have a wing that changed shape in response to the chaotic flow, so that the next calculation of the model was not just applying a new flow, but a new flow over the new shape predicted in the previous calculation. Any error generated in the computation of the new shape of the wing would be carried forward into the next calculation of the wing shape and so on. Soon…you will have a shape that is nothing like a wing, unless you create a function in the program that ‘forces’ a wing shape over time.
Daily prediction models do this. The further out in time they try to predict, the more powerful climatology becomes in the calculation, otherwise the models would run off into ridiculous solutions beyond 7-10 days.
It has been argued that climate models are different than forecasting models in that they are not dependent on initial conditions. In other words, forecasting models begin with an initial sampling of the global atmosphere and run it out into the future. Climate models, it is argued, are a snapshot of the atmosphere under different conditions, like a snapshot of the turbulent flow of air over a wing at any given wind speed. But this argument ignores the multiple feedbacks inherent in the climate system, particularly clouds and the water cycle. These feedbacks create a ‘changing wing’ in climate models, and any calculation of climate over time will be dependent on the previous calculation and subject to ever-increasing error.
So even if all the assumptions about the atmosphere, Earth, Sun, cosmic rays and so on, were absolutely correct, then the models would only be AS accurate as the random walk models, for they would still be subject to escalating feedback errors.
The fact that the models are so much worse than a random walk is proof that one or more of the assumptions (or equations) are incorrect!

David L.
June 14, 2012 9:52 am

John F. Hultquist says:
June 14, 2012 at 8:10 am
“This post and comments include statements with the terms random and chaotic with regard to weather and climate. I wonder who gets to define the terms, outcomes, and miss or match of the results? For example, if the weather forecaster looks at a computer output and it says the High Temp for tomorrow will be X.23 and then posts X and then the actual High comes as X +2, is that forecast considered wrong?”…
There have been studies into the accuracy of weather predictions, and there’s an extensive literature dealing with forecast verification. See for example the paper “Verification of The Weather Channel Probability of Precipitation Forecasts” by J. Eric Bickel and Seong Dae Kim. They used a “distribution-oriented” framework proposed by Murphy and Winkler:
Murphy, A. H., and R. L. Winkler, 1977: Reliability of subjective probability forecasts of precipitation and temperature. Appl. Stat., 26, 41–47.
Murphy, A. H., and H. Daan, 1985: Forecast evaluation. Probability, Statistics, and Decision Making in the Atmospheric Sciences, A. H. Murphy and R. W. Katz, Eds., Westview Press, 379–437.
Murphy, A. H., and R. L. Winkler, 1987: A general framework for forecast verification. Mon. Wea. Rev., 115, 1330–1338.
Murphy, A. H., and R. L. Winkler, 1992: Diagnostic verification of probability forecasts. Int. J. Forecasting, 7, 435–455.

Jim Clarke
June 14, 2012 9:59 am

Kevin Kinser says:
June 14, 2012 at 8:02 am
…This is a welcome methodological critique of models, not of the science behind them.
From the original paper’s conclusions:
“…there is no support in the evidence we present for those who reject the whole notion of global warming: the forecasts still remain inexorably upward, with forecasts which are comparable to those produced by the models used by the IPCC.”
From the postscript:
“Climate change is a major threat to us all, and this requires the IPCC community of climate modellers to be more open to developments and methods from other fields of research.”
Kevin…I have seen quotes like these in almost every study that weakens the AGW theory. Some studies have produced evidence that directly contradicts the theory, like the study that indicated Antarctic cooling, and then pronounced in the conclusion that it does not contradict the AGW theory and that AGW is an extremely dangerous situation that needs further study.
Are these statements derived from the results of their research? No! They contradict the research! Authors make these statements because they like their friends, their jobs and their paychecks.

Mark Bofill
June 14, 2012 10:06 am

Pielke Sr. has been pointing this (more or less) out for as long as I’ve been reading his blog – climate models show no skill, over and over again. Unfortunately, if they aren’t listening to him, they aren’t going to listen to anybody. It’s pretty sad.

David L.
June 14, 2012 10:14 am

Mark Bofill says:
June 14, 2012 at 10:06 am
“Pielke Sr. has been pointing this (more or less) out for as long as I’ve been reading his blog – climate models show no skill, over and over again. Unfortunately, if they aren’t listening to him, they aren’t going to listen to anybody. It’s pretty sad.”
It doesn’t benefit them one bit to listen to an alternative explanation. That doesn’t pay their bills.

Kevin Kinser
June 14, 2012 10:32 am

Jim Clarke says:
June 14, 2012 at 9:59 am
“Are these statements derived from the results of their research? No! They contradict the research!”
The study did not address the theory. It addressed the models. The authors are clarifying this so that people would not misunderstand their conclusions. Both published commentaries touch on this, so the authors emphasize it again in the postscript. It would be going beyond their data to conclude that the science behind the models is incorrect. You may infer this, but it is empirically unjustified based on the research they did.

A fan of *MORE* discourse
June 14, 2012 10:54 am

Jim Clarke says: “Your analogy of fluid dynamics over a wing is not appropriate for predicting climate. To make it more compatible, you would have to have a wing that changed shape in response to the chaotic flow, so that the next calculation of the model was not just applying a new flow, but a new flow over the new shape predicted in the previous calculation.”

Jim, your assertion is just plain wrong-on-the-facts. Existing computational fluid dynamics (CFD) simulations *DO* take into account the nonlinear deformation, both static and dynamic, of wings under aerodynamic loading … this aeroelastic forcing is the CFD analog of climatological forcing.
As documented on Judith Curry’s weblog, aeroelastic forcing can be exceedingly large and grossly nonlinear, and nonetheless be accurately simulated.

Roger Longstaff
June 14, 2012 10:55 am

On her blog, Dr. Tamsin Edwards has kindly given me a reference to The UK Met Office code:
http://cms.ncas.ac.uk/code_browsers/UM4.5/UMbrowser/
“This version of the Met Office model (v4.5, HadCM3, 1999) was used for the UK Climate Projections: UM 4.5 code.
[Edit 17:12 – This is also the version we are using in our estimate of climate sensitivity.] Similar versions (such as lower resolution) have been used by climateprediction.net. It is available for academic use, subject to signing a licence agreement. This version is much faster than the current generation so is still used a lot for large groups (ensembles) of simulations, palaeoclimate studies and other areas where you need a lot of, or long, simulations. See for example the Met Office and climateprediction.net model pages.
The Unified Model is now up to about version 8.2, I think, for operational weather forecasting. IPCC runs for AR5 were done with v6.6 (HadGEM2-ES).”
I thank Dr. Edwards for this, as it is something that I have requested for a long time.

June 14, 2012 11:04 am

A fan of *MORE* discourse . You write “Yet global climate changes are constrained by strict conservation of mass and energy and global increase in entropy, and thus *CAN* be predicted.”
I see others have been ahead of me in pointing out that this is utter garbage. It will be interesting to see whether you try and respond to them, here on WUWT, as you did with me on Climate Etc. I suspect on this forum, you will simply disappear.

CTL
June 14, 2012 11:10 am

“That’s a nice graph you have there. Very evenly splayed out. Did you pick it yourself? 🙂
Go on, tell us. How many runs DID you try until you found that nice one?” [johnwdarcher]
johnwdarcher,
The graph that Anthony included in the post is the example graph from the Wikipedia article that he linked to, which explains what a Random Walk is.

Earle Williams
June 14, 2012 11:16 am

johnwdarcher,
Click on the link to the wikipedia article for Random Walk: http://en.wikipedia.org/wiki/Random_walk
By the way, while walking into my building this morning, I passed a car with the tag number of FTH 336. What are the odds of that?

Earle Williams
June 14, 2012 11:18 am

Anthony and/or mods, you may want to note that the graph in the post is sourced from Wikipedia, to put silly speculation like that from johnwdarcher to rest.

RockyRoad
June 14, 2012 11:57 am

mt says:
June 14, 2012 at 7:59 am

@RockyRoad read the paper. As far as I can tell, they’re giving scores based on looking at future data (2007-2017 and 2007-2027). How is it possible to say what model is “right” and “wrong” versus data that doesn’t yet exist?

Last time I looked at my calendar, “2007” is a number less than “2012”. By 5, to be exact.

Phil.
June 14, 2012 12:29 pm

Jim Clarke says:
June 14, 2012 at 9:41 am
The fact that the models are so much worse than a random walk is proof that one or more of the assumptions (or equations) are incorrect!

This has not been shown, so this ‘proof’ fails!

Ian W
June 14, 2012 12:56 pm

George E. Smith; says:
June 14, 2012 at 9:08 am
“””””…..ferd berple says:
June 14, 2012 at 7:07 am
Philip Richens says:
June 14, 2012 at 2:27 am
Pretty certain that what RB has in mind is that (A) short term random noise (white noise) prevents models from making accurate forecasts over weeks and months and (B) the white noise is averaged out over climate time scales (decades)
=========
Correct, the climate models make the assumption that climate is normally distributed (constant mean and deviation), that the errors plus and minus average out over time……”””””
Well, white noise may average out to zero over long time scales, but only in mathematical models, which have a Gaussian distribution. Physical systems don’t have such an ultimate error distribution. Most of them eventually show the presence of 1/f noise, where the amplitude of an error or step can grow without limit, but at an occurrence frequency that is inversely proportional to the size of the error or step. One can make the argument that 1/f noise is a consequence of Heisenberg’s Principle of Uncertainty. Maybe the Big Bang was simply the bottom end of the 1/f noise spectrum.
So in real physical systems, noise does not average out to zero no matter how long you wait, because there is always the chance of a step much greater than anything previously seen. And no, 1/f noise does not violate any thermodynamic limits, such as the total power or energy growing without limit, which it would in a white noise system. It is simple mathematics to show that with 1/f noise each octave of frequency range contains the same amount of power as any other, no matter how large the amplitude gets.

Perhaps then the exercise should be repeated with a ‘Lévy flight’ random walk, which has fatter tails than the Gaussian distribution?
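Purely as an illustration of that suggestion (a sketch only, not from the paper or the comment; the Cauchy distribution stands in here for a heavy-tailed Lévy-stable step), a Lévy-flight-like walk can be put beside a Gaussian one as follows:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Ordinary walk: Gaussian (thin-tailed) steps.
gauss_steps = rng.normal(0, 1, n)
# Levy-flight-like walk: heavy-tailed steps (standard Cauchy), so rare
# jumps can be far larger than anything seen before.
cauchy_steps = rng.standard_cauchy(n)

gauss_walk = np.cumsum(gauss_steps)
levy_walk = np.cumsum(cauchy_steps)

print("largest Gaussian step:    ", np.abs(gauss_steps).max())
print("largest heavy-tailed step:", np.abs(cauchy_steps).max())
```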

Jim Clarke
June 14, 2012 1:32 pm

Kevin Kinser says:
June 14, 2012 at 10:32 am
“It would be going beyond their data to conclude that the science behind the models is incorrect. You may infer this, but it is empirically unjustified based on the research they did.”
I guess what you and the authors are arguing is that the ‘science’ and the ‘models’ are two very different things, and that the science could be right even if the models are wrong! I guess that is possible, but it also creates quite the conundrum, because the reverse is also true. The models could be right, even if the science is wrong (as was the case with epicycles and celestial movement).
In order to use the scientific method, however, we have to test the theory with prediction and then see if the prediction holds. In the case of climate change, the models are the prediction. If we cannot equate the models with the prediction derived from the theory, then there is no way to falsify the theory and it is not science.
So we have two choices:
1. We can adhere to the scientific method, consider the models are giving us a prediction of the theory and, as the models fail to make accurate predictions, conclude that there is a problem with the theory and that it needs to be revamped.
2. We can conclude that, even as the models fail to verify, the theory is still strong and there is something wrong with the mechanics in the software, even though there is no evidence for a robust theory.
You and the authors are asking all the world to join a cult and pay hefty dues. I feel quite rational and justified in calling the theory into question based on the performance of the models.

RockyRoad
June 14, 2012 2:14 pm

Phil. says:
June 14, 2012 at 12:29 pm

Jim Clarke says:
June 14, 2012 at 9:41 am
The fact that the models are so much worse than a random walk is proof that one or more of the assumptions (or equations) are incorrect!
This has not been shown, so this ‘proof’ fails!

But nowhere, Phil, have you been able to show that the [climate] models are BETTER!
That’s the crux of this whole issue.

Zeke
June 14, 2012 2:25 pm

It is true that GCMs forecast catastrophe by taking tiny CO2 molecular inputs and amplifying them with water vapor and clouds in mysterious ways. Suggesting ways to further “tune” the models to reality, and to get forecasts which are at least as likely to match regional weather events as a “random walk,” is going to fall on deaf ears. Because they still need to make a mountain out of a molehill-inator.
http://zekeunlimited.wordpress.com/2012/06/14/dr-doofenschmurtz-evil-inc-unveils-the-mountain-out-of-a-molehill-inator-plans-to-take-over-the-entire-tri-state-area/

johnwdarcher
June 14, 2012 2:31 pm

CTL,
“The graph that Anthony included in the post is the example graph from the Wikipedia…” [CTL]
Whoops! Thanks. I missed that. So it was Wiki that cherry picked the graph, not Anthony.
Not that there’s anything wrong in doing that though. One would naturally choose a ‘nice’ example for illustrative purposes so that the general features were brought out clearly.
Earle Williams,
“By the way, while walking into my building this morning, I passed a car with the tag number of FTH 336. What are the odds of that?” [EW]
Depends. But seeing as it was near your building there’s probably a fair chance that it belongs to one of your colleagues or someone else who works close by, so I’d guess the probability was pretty high, and higher if you are habitually late. But it might have been a client’s or customer’s for all I know, so it depends on how often he drops by and what time he likes to arrive. And so on for other possibilities. But you know your environment far better than I do.
Monitor it for a week or a fortnight though. That should give you a rough idea. But it strikes me as a little odd that you bother with such distractions. Try to focus on things that are more important to you is my advice.
“…that graph in the post is sourced from Wikipedia, to put silly speculation like that from johnwdarcher to rest…” [EW]
“Silly”? Never mind my mistaken attribution, there was nothing silly about my speculation. 100 to 1 says that graph was cherry-picked by whoever did it. There were only 8 random walks in it. That’s far too low a number to get a nice symmetric fan shape like the one shown just by chance. If you don’t believe me, run a few simulations yourself and see how many runs you need to get a ‘nice’ one like Wiki’s.
Randomness generally produces ugly patterns if left to itself when the number of ‘options’ available is small. For example, if you ever decide to tile a wall with a random pattern made with tiles of, say, only four different colours then you’d be well advised not to actually place them at random but rather to plan ahead and tweak the pattern. If you don’t you’ll end up with noticeably large ugly sub-configurations of the same colour. Again, if you don’t believe me run a simulation yourself on a grid of, say, 20 x 20 cells.
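A minimal sketch of that 20 x 20 tile experiment (illustrative only; it assumes numpy and scipy are available), counting the largest same-colour patch that turns up purely by chance:

```python
import numpy as np
from scipy.ndimage import label

rng = np.random.default_rng(2)
grid = rng.integers(0, 4, size=(20, 20))   # four colours placed at random

largest = 0
for colour in range(4):
    # Label connected patches of this colour and record the biggest one.
    labelled, n_patches = label(grid == colour)
    if n_patches:
        largest = max(largest, np.bincount(labelled.ravel())[1:].max())

print("largest single-colour patch:", largest)
```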
Finally, a question for you: did you bother to look at the image in the link I gave?

Roger Longstaff
June 14, 2012 3:12 pm

The user guide to the Met Office code has been kindly supplied by Dr. Edwards. It states (p37) that Fourier filtering is available AT EACH TIMESTEP to reduce noise and instability:
http://cms.ncas.ac.uk/index.php/component/docman/doc_details/34-um-user-guide
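To make concrete what per-timestep Fourier filtering can mean in general (an illustrative sketch only, not the Met Office implementation), a low-pass filter that damps the shortest wavelengths of a periodic 1-D field might look like this:

```python
import numpy as np

def fourier_lowpass(field, keep_fraction=0.25):
    """Zero the highest wavenumbers of a periodic 1-D field, suppressing
    grid-scale noise while keeping the large-scale signal."""
    spectrum = np.fft.rfft(field)
    cutoff = max(1, int(len(spectrum) * keep_fraction))
    spectrum[cutoff:] = 0.0
    return np.fft.irfft(spectrum, n=len(field))

# Toy usage: a smooth wave plus grid-scale noise, filtered once per "timestep".
x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
field = np.sin(x) + 0.3 * np.random.default_rng(3).normal(size=x.size)
for _ in range(10):               # pretend these are model timesteps
    field = fourier_lowpass(field)
```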

Duster
June 14, 2012 3:22 pm

old construction worker says:
June 14, 2012 at 12:16 am
Climate Models: If a frog had wings,……….

Would they be pigs?

June 14, 2012 3:54 pm

Why are we referencing Wikipedia? Are we trying to embarrass ourselves? Can we please reference valid sources that cannot be edited by anyone with an Internet connection?

Crispin in Waterloo
June 14, 2012 4:40 pm

@A fan of *MORE* discourse
“As documented on Judith Curry’s weblog, aeroelastic forcing can be exceedingly large and grossly nonlinear, and none-the-less be accurately simulated.”
++++++++
Given this fact, what explanation remains for the inability of the models to generally predict temperature over a decade or longer? To me, it seems to be the math that is supposed to be capturing the physical relationships. If complex modelling works (I use it in combustion analysis and heat transfer, which is pretty darned hard) then we can take it that the framework is reasonable. Plugging the relationships into the framework is giving the wrong answer. They must have some physical understanding fundamentally wrong.
One plus one = CO2.
