Guest Essay by Geoffrey H Sherrington
………………………………………………………..
This short note was inspired by Viscount Monckton writing on 13th June on WUWT about the leveling of global temperatures since 1997 or so, and the increasing mismatch with a number of climate models.
This graph by Drs Christy and Spencer of UAH is referenced in Anthony’s article ‘The Ultimate “Sceptical Science” Cherry Pick’ of 10th June and will be referred to as ‘the spaghetti graph’.
In the Viscount Monckton article there was a significant contribution by rgbatduke, since elevated to a post titled ‘The “ensemble” of models is completely meaningless, statistically’, whose comments can be read with reference to this spaghetti graph (with thanks to its authors).
Basically, rgbatduke argued that model methods for calculating spectra of chemical elements such as carbon had limitations; that there was a history of successive model improvements; that the average of such successive models was meaningless; that the models did not succeed without some computational judgment; and that even then they were not as good as the measured result. The same comments should be applied to the various climate model comparisons shown in the spaghetti graph, particularly the meaningless average. Climate models were stated to be far more complex than atomic spectral calculations.
Here is another model graph, one from my files.
The numerical raw data are here, with some early stage statistics derived from the raw data. I make no claims about the number of significant figures carried, the distributions of data, etc. The purpose of this essay is to invite your comments.
The graph has time on the X-axis. For this exercise, the interval does not matter except to note that the data are equally spaced in time. The Y-axis is a dimensionless, integer-valued model score. It can increment by 0, 2 or 4 units at a time (for example, ‘no change’ = 0 units, ‘some change’ = 2 units, ‘much change’ = 4 units). The dark dots are the arithmetic average of each time slice.
You will see that only the first part of the graph is shown. The object of the exercise is to use the information content of the shown data to calculate projections of each of the series out to 23 time spans. The correct answer, as derived from experiment, is known. It is with the WUWT team.
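For anyone who prefers to play with the numbers in code rather than in a spreadsheet, here is a minimal Python sketch that reproduces the dark dots (the per-time-slice averages) and their step-to-step growth. It assumes the comma-delimited file linked later in this thread, with one row per series, a label in the first column and the 20 scores in the following columns; adjust the slicing if the actual layout differs.

```python
# Minimal sketch: load the score table and reproduce the "dark dots", i.e. the
# arithmetic average of the 18 series at each of the 20 time slices shown.
# Assumes one row per series, a label in the first column and the 20 scores
# in the next 20 columns (layout not confirmed; adjust if it differs).
import urllib.request

URL = "http://www.geoffstuff.com/Book2.csv"  # posted later in the thread

with urllib.request.urlopen(URL) as f:
    lines = [ln.decode("utf-8").strip() for ln in f if ln.strip()]

series = []
for line in lines:
    cells = line.split(",")
    if len(cells) < 21:
        continue
    try:
        series.append([float(x) for x in cells[1:21]])  # skip the label column
    except ValueError:
        continue  # skip a header row, if present

n_series = len(series)
slice_means = [sum(s[t] for s in series) / n_series for t in range(20)]
print(slice_means)                                                        # the dark dots
print([round(b - a, 2) for a, b in zip(slice_means, slice_means[1:])])    # growth per step
```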
If, as you work, you seek more information, then please ask. No reasonable requests refused.
If, as you work, you feel you know the source of the data, please don’t tell the others.
Finally, what is the purpose of all of this? The answer is to try to emulate a simple climate model projection. I do not know if all climate models follow the same routines to get from start to finish, whether they calculate year by year, similar to this example, or whether the whole lot goes into a series of matrices that are solved one after another.
However, consider that for this exercise you have 18 models that have each yielded 20 years of data. Let us all see how well you can project to end of year 23. In a week I’ll post the full graph and full set of figures, then I’ll make some comments on your methods of solving this problem.
As will, I hope, a few others.
Repeating: The exercise is to project the data to the end of time period 23.
That’s 3 more slots.
Do you have that data as a .csv or .txt file?
I would like to enter the contest, but I don’t understand the data. You say there are 18 different series and you’re looking for a projection for each at time 23. That’s 18 projections in all.
— What’s the relationship among the 18 series?
— Why is the average meaningful?
— Is there any reason to believe that each of the series will grow at comparable rates?
— Would it be reasonable to guess that each series grows sort-of linearly? That is, the series that have grown faster so far will continue to grow faster?
I think I see one point. The total has grown exactly 2 units per time period. It’s tempting to predict that it will continue to do so. However, if the series have nothing to do with each other, then the steady rate of growth of their average is likely just a coincidence.
I don’t see much point in this exercise. Without knowing anything about the “physics” involved in this model, it’s hard to expect anything other than that the evolution will continue at approximately the same constant rate as in the recorded period. That could of course work, unless the real model comes with some kind of effect that kicks in no sooner than step 21. And that’s what I actually expect to happen; otherwise there would be no point in making this article.
I also don’t see any relation between this exercise and the Spencer/Christy graph or rgbatduke’s post.
Surely the ‘average of the models’ has some meaning?
It has no real physical meaning, of course, corresponding to any physical phenomenon. But it measures something. It measures what the different modelers think they can get away with in their models.
The main problem here, and the obvious stupidity, which is shared with the concept of running ‘consensus surveys’, is that you’re measuring a moving target. Back in the 1990s, a consensus study would probably have shown a valid 80-90% of papers supporting, and the model predictions were all steep upward graphs. By now, the gradients of the model graphs have come down a lot.
So a moving average of those gradients would show a trend: a trend from high-pitched alarmism down towards a more justifiable exaggeration…
At first glance I would need 8 possible outcomes for the next 3 time steps of each of the 18 series. Beyond giving an 8-value range at time interval 23 for each of the 18 series, this has no predictive skill. Averages or trends have no predictive ability on data generated in this manner.
Every number in your table is a multiple of 4, which kind of contradicts the above.
And there are no steps of 8, so this appears to be a simple binary walk multiplied by 4. Why bother with the scaling?
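To illustrate the binary-walk point with a short sketch: if the only allowed increments are 0 or 4, three more time slices give 2^3 = 8 possible paths per series, but only four distinct end values. The starting value below is invented.

```python
# If each series can only step by 0 or by 4 per time slice, then three more
# slices give 2**3 = 8 possible paths and only four distinct end values
# (current, +4, +8, +12). The starting value is made up for illustration.
from itertools import product

current = 40                                        # hypothetical score at t = 20
paths = list(product((0, 4), repeat=3))             # all 8 three-step sequences
end_values = sorted({current + sum(p) for p in paths})

print(len(paths))    # 8
print(end_values)    # [40, 44, 48, 52]
```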
Mike McMillan says:
June 20, 2013 at 12:30 am
42
‘Good-bye, and thanks for all the fish.’
The Y-axis is a dimensionless, integer-valued model score. It can increment by 0, 2 or 4 units at a time (for example, ‘no change’ = 0 units, ‘some change’ = 2 units, ‘much change’ = 4 units).
I don’t see any increments of 2 in the data. Am I being blind, or is that an oddity?
Why not use median and quartiles?
Dodgy Geezer says: June 20, 2013 at 1:13 am “Surely the ‘average of the models’ has some meaning?”
Well, it’s not a straight line.
……………………………
steveta_uk says:
June 20, 2013 at 1:33 am “It can increment by 0, 2 or 4 units at a time”
“Every number in your table is a multiple of 4, which kind of contradicts the above.”
Not necessarily. The 2 score might not have appeared yet. Please see below.
………………………
David in Cal says: June 20, 2013 at 12:54 am “What’s the relationship among the 18 series?”
Each point can increment positively by 0, 2, or 4. No negative values.
……………………………
It is obvious that some series grow faster than others. Looking ahead, one might assume that the probability of a change in growth rate for a series is low. But the average grows in rather constant steps. Note that each new point in a series is pegged to the last. That is a constraint on prediction; it makes it likely that the numbers do not suddenly diverge or converge. Using these three guides, can we try to predict the 18 values in the next time slot? Then the next and the next?
These are real data taken from an actual experiment. They are not generated by any algorithm; they are a measure of an energy, if that helps, but it is a ranked measure. I have other similar tables, some in which the 2 growth option happens more often. I am leaving space and conditions open in case a point can be illustrated by using another set.
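One crude way to turn those three guides into numbers, offered purely as an illustration and not as Geoff’s method: treat each series as a walk pegged to its last value, resample that series’ own observed increments, and average the end point over many trials. The example series below is made up.

```python
# Illustrative only: project a single series to t = 23 by resampling its own
# observed increments (0s, 2s or 4s), starting from its last value each time,
# and averaging over many trials. The example series is invented.
import random

def project(values, extra_steps=3, trials=10_000, seed=1):
    steps = [b - a for a, b in zip(values, values[1:])]  # increments seen so far
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = values[-1]                                   # pegged to the last point
        for _ in range(extra_steps):
            x += rng.choice(steps)                       # resample a past increment
        total += x
    return total / trials                                # expected score at t = 23

example = [0, 4, 4, 8, 12, 12, 16, 20, 20, 24,
           28, 28, 32, 36, 36, 40, 44, 44, 48, 52]       # made-up 20-step series
print(project(example))
```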
Mike,
Data as .csv, comma delimited.
http://www.geoffstuff.com/Book2.csv
Here is what I did:
I subtracted column 1 from column 20, divided by 20, then multiplied by 3, to establish a rough approximation of the rate for each series (Column 23 = (((U2-B2)/20)*3)+U2). I then filled in columns 21 and 22 with values greater than or equal to Column 20, but not greater than Column 23, using only increments of 0 or 4. Since I generated some odd numbers in Column 23, these are rounded to the nearest even number, while trying to maintain increments of 4.
The average for Column 23 is 43 (rounded to 44).
I believe my method is a sufficient approximation of the data and was easy, and I look forward to comparing in part 2.
Thanks
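For readers following along in Python rather than a spreadsheet, here is a rough equivalent of the extrapolation step described above; the example input is invented.

```python
# A rough Python equivalent of the spreadsheet step above,
# Column 23 = (((U2 - B2) / 20) * 3) + U2: take the average rate between
# column 1 and column 20, project three more time steps, and round the
# result to the nearest even score. Example inputs are invented.
def extrapolate_to_23(first, last):
    rate = (last - first) / 20.0      # average growth per time step
    raw = last + 3 * rate             # three steps beyond t = 20
    return 2 * round(raw / 2)         # snap to the nearest even number

print(extrapolate_to_23(0, 56))       # e.g. a series that rose from 0 to 56 -> 64
```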
My “guess” 🙂
http://i44.tinypic.com/2qxynfq.jpg
X Anomaly
Thank you. What I am seeking is a score for each series at the end, rather than an average. Or, failing that, a final ranking of the order of the series from highest to lowest. Geoff.
Geoff Sherrington says: June 20, 2013 at 3:19 am
Data as .csv, comma delimited.
Thank you sir.
Just like on the SAT, only not.
Do I understand correctly that the black line is the data, and the others are from models trying to emulate the data?
Fred Singer notes published variations of an order of magnitude between runs of the same GCM. He finds about 400 model-years of runs are required to reduce much of the chaotic variation, e.g. 20 runs of a 20-year model (40 runs for 10 years, 10 runs for 40 years, etc.). However, most IPCC models show only one run, or are averaged over a few runs, five at the most.
Are we to assume that each line is an individual run?
Or are they averaged over multiple runs?
See Overcoming Chaotic Behavior of Climate Models
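The statistical point behind needing many runs can be shown with a toy Python sketch: averaging N independent runs shrinks the chaotic run-to-run scatter of the mean by roughly 1/sqrt(N). The spread value used below is invented, not taken from any actual GCM.

```python
# Toy demonstration: the spread of the mean of N independent runs falls off
# roughly as 1/sqrt(N). The single-run spread is an assumed, illustrative number.
import math
import random

rng = random.Random(0)
sigma_one_run = 0.25        # assumed run-to-run spread of a modelled trend

for n_runs in (1, 5, 20):
    means = []
    for _ in range(20_000):
        runs = [rng.gauss(0.0, sigma_one_run) for _ in range(n_runs)]
        means.append(sum(runs) / n_runs)
    observed = math.sqrt(sum(m * m for m in means) / len(means))
    expected = sigma_one_run / math.sqrt(n_runs)
    print(n_runs, round(observed, 3), round(expected, 3))
```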
Column 23: 64, 68, 68, 68, 60, 56, 52, 56, 48, 44, 52, 40, 32, 24, 24, 16, 8, 8
Thank you for the last several posts.
I think that you might need to review the .csv download, as I have been having problems with backward compatibility. Sorry for the inconvenience. Geoff.
Rather than http://www.geoffstuff.com/Book2.csv, it is best to use this URL:
http://www.geoffstuff.com/Book2.txt
What I see is individual values rising to a figure that is then nearly constant. The actual height differs for each trace, and the data are poorly quantised around those lines.
Close to the underlying physical behaviour?
Series   Value at t=20   BE at t=23   ML at t=23
1        56              64.2         64
2        60              68.8         68
3        60              68.8         68
4        60              69.5         70
5        52              59.6         60
6        48              55.6         56
7        44              50.3         50
8        48              55.6         56
9        40              46.3         46
10       40              45.7         46
11       44              50.3         50
12       36              41.7         42
13       28              31.8         32
14       20              22.5         22
15       20              23.2         24
16       12              13.9         14
17       8               9.3          10
18       8               9.3          10
Where:
Value at t=20 is from the supplied table
BE at t=23 is the best estimate at time step t=23
ML at t=23 is the most likely value at time step t=23
If it was possible to guess at a model with faulty and/or poor understanding of the underlying mechanisms and then project that into the future with any degree of accuracy, these guys wouldn’t be fooling around with climate models but with stock market models.
The forces behind the climate are clearly not understood. Therefore no model can possibly have any accuracy in extrapolation. These models aren’t even good at interpolating the real observed data.
“And my posts have shown and will continue to show that the climate models, using those model means, show no skill at hindcasting. If they show no skill at hindcasting, there’s no reason to believe their projections of future climate.”
“No skill” is a technical term. Each model has skill.
The simple fact is that you can measure the skill of every model. You can measure the skill of the ensemble.
It matters little whether the ensemble mean has statistical meaning, because it has practical skill, a skill that is better than any individual model. The reason why the ensemble has more skill is simple: the ensemble of models reduces weather noise and structural uncertainty.
http://en.wikipedia.org/wiki/Forecast_skill
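As to what “skill” means operationally: it is scored against a reference forecast, commonly as 1 minus the ratio of the forecast’s mean squared error to the reference’s. A toy Python sketch with invented numbers:

```python
# "Skill" in this technical sense is scored against a reference forecast,
# commonly as skill = 1 - MSE(forecast) / MSE(reference). All numbers below
# are invented purely to show the arithmetic, not taken from any model run.
def mse(pred, obs):
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)

obs       = [0.10, 0.12, 0.08, 0.15, 0.11]   # hypothetical observed anomalies
model     = [0.11, 0.11, 0.09, 0.14, 0.10]   # hypothetical model hindcast
reference = [0.11] * 5                       # naive reference (climatology)

skill = 1.0 - mse(model, obs) / mse(reference, obs)
print(round(skill, 2))   # positive means the forecast beats the reference
```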