Guest Post by Willis Eschenbach
Among the papers in the Copernicus Special Issue of Pattern Recognition in Physics we find a paper from R. J. Salvador in which he says he has developed “A mathematical model of the sunspot cycle for the past 1000 yr”. Setting aside the difficulties of verifying sunspot numbers for, say, the year 1066, let’s look at how well his model can replicate the more recent record of the last few centuries.
Figure 1. The comparison of the Salvador model (red line) and the sunspot record since 1750. Sunspot data is from NASA; kudos to the author for identifying the data source.
Dang, that’s impressive … so what’s not to like?
Well, what’s not to like is that this is just another curve fitting exercise. As old Joe Fourier pointed out, any arbitrary wave form can be broken down into a superposition (addition) of a number of underlying sine waves. So it should not be a surprise that Mr. Salvador has also been able to do that …
However, it should also not be a surprise that this doesn’t mean anything. The problem is that no matter how well we can replicate the past with this method, it doesn’t mean that we can then predict the future. As the advertisements for stock brokers say, “Past performance is no guarantee of future success”.
One interesting question in all of this is the following: how many independent tunable parameters did the author have to use in order to get this fit?
Well, here’s the equation that he used … the sunspot number is the absolute value of
Figure 2. The Salvador Model. Unfortunately, in the paper he does not reveal the secret values of the parameters. However, he says you can email him if you want to know them. I passed on the opportunity.
So … how many parameters is he using? Well, we have P1, P2, P3, P4, F1, F2, F3, F4, N1, N2, N3, N4, N5, N6, N7, N8, L1, L2, L3, and L4 … plus the six decimal parameters, 0.322, 0.316, 0.284, 0.299, 0.00501, and 0.0351.
Now, that’s twenty tunable parameters, plus the six decimal parameters … plus of course the free choice of the form of the equation.
With twenty tunable parameters plus free choice of equation, is there anyone who is still surprised that he can get a fairly good match to the past? With that many degrees of freedom, you could make the proverbial elephant dance …
Now, could it actually be possible that his magic method will predict the future? Possible? I suppose so. Probable? No way. Look, I’ve done dozens and dozens and dozens of such analyses … and what I’ve found out is that past performance is assuredly no guarantee of future success.
So, is there a way to determine if such a method is any good? Sure. Not only is there such a test, but it’s a simple one, and we have discussed it here on WUWT. We’ve even discussed it with various of the authors of the Special Issue … to no avail, so it seems.
The way to test this kind of model is bozo-simple. Divide the data into the first half and the second half. Train your model using only the first half of the data. Then see how it performs on the second half, what’s called the “out of sample” data.
Then do it the other way around. You train the model on the second half, and see how it does on the first half, the new out-of-sample data. If you want, as a final check you can do the training on the middle half, and see how it works on the early and late data.
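For anyone who wants to try it, here is a minimal sketch of that split-half test in Python. To be clear about the assumptions: this is not the Salvador model, whose parameter values are not published. The stand-in model is a plain least-squares fit of sine/cosine pairs at periods I picked arbitrarily, and the series is synthetic (a cycle whose length drifts at random between 9 and 13 steps), not the NASA sunspot data. All the function names are mine.

```python
import math
import random

def design_row(t, periods):
    """One regression row: a constant plus a sine/cosine pair per period."""
    row = [1.0]
    for p in periods:
        w = 2.0 * math.pi / p
        row += [math.sin(w * t), math.cos(w * t)]
    return row

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(times, values, periods):
    """Least-squares coefficients from the normal equations (X^T X) c = X^T y."""
    rows = [design_row(t, periods) for t in times]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    return solve(xtx, xty)

def predict(times, coeffs, periods):
    return [sum(c * x for c, x in zip(coeffs, design_row(t, periods))) for t in times]

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def split_half_test(times, values, periods):
    """Train on each half in turn; return (in-sample RMSE, out-of-sample RMSE) pairs."""
    mid = len(times) // 2
    halves = [(slice(0, mid), slice(mid, None)), (slice(mid, None), slice(0, mid))]
    results = []
    for train, test in halves:
        coeffs = fit(times[train], values[train], periods)
        in_err = rmse(values[train], predict(times[train], coeffs, periods))
        out_err = rmse(values[test], predict(times[test], coeffs, periods))
        results.append((in_err, out_err))
    return results

# Synthetic stand-in series: a cycle whose length wanders between 9 and 13 steps.
random.seed(0)
phase, series = 0.0, []
for _ in range(264):
    phase += 2.0 * math.pi / random.uniform(9.0, 13.0)
    series.append(50.0 + 50.0 * math.sin(phase))
years = list(range(264))

# Eleven tunable coefficients: one constant plus five sine/cosine pairs.
for in_err, out_err in split_half_test(years, series, [7.0, 9.0, 11.0, 13.0, 16.0]):
    print(f"in-sample RMSE {in_err:.1f}   out-of-sample RMSE {out_err:.1f}")
```

The comparison that matters is the pair of numbers on each line. A model that has merely been tuned to its training half will typically show a much larger error on the half it never saw.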
I would be shocked if the author’s model could pass that test. Why? Because if it could be done, it could be done easily and cleanly by a simple Fourier analysis. And if you think scientists haven’t tried Fourier analysis to predict the future evolution of the sunspot record, think again. Humans are much more curious than that.
In fact, the Salvador model shown in Figure 2 above is like a stone-age version of a Fourier analysis. But instead of simply decomposing the data into the simple underlying orthogonal sine waves, it decomposes the data into some incredibly complex function of cosines of the ratio of cosines and the like … which of course could be replaced by the equivalent and much simpler Fourier sine waves.
But neither one of them, the Fourier model or the Salvador model, can predict the future evolution of the sunspot cycles. Nature is simply not that simple.
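The Fourier point is easy to check for yourself. The sketch below, in plain Python, takes a record of pure random numbers (my stand-in, not sunspot data), computes its discrete Fourier transform, and rebuilds the record from the resulting sinusoids. The function names are mine, and the slow textbook transform is used only to keep the example self-contained.

```python
import cmath
import random

def dft(values):
    """Discrete Fourier coefficients of a real series (direct O(n^2) sum)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(values)) / n
            for k in range(n)]

def evaluate(coeffs, t):
    """Value of the sum of sinusoids at time t, inside or beyond the record."""
    n = len(coeffs)
    return sum(c * cmath.exp(2j * cmath.pi * k * t / n)
               for k, c in enumerate(coeffs)).real

random.seed(1)
record = [random.uniform(0.0, 200.0) for _ in range(64)]  # pure noise
coeffs = dft(record)

# The sinusoids reproduce the "past" to within rounding error ...
max_error = max(abs(evaluate(coeffs, t) - v) for t, v in enumerate(record))
print("largest in-sample error:", max_error)

# ... but the "forecast" for step 64 + k is just the value at step k again.
for k in range(3):
    print(k, round(record[k], 3), round(evaluate(coeffs, 64 + k), 3))
```

With as many sinusoids as data points the fit to the past is perfect even for noise, and the extrapolation simply replays the record from the start. A perfect hindcast tells you nothing about forecast skill.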
I bring up this study in part to point out that it’s like a Fred Flintstone version of a Fourier analysis, one that uses no fewer than twenty tunable parameters and has not been tested out-of-sample.
More importantly, I bring it up to show the appalling lack of peer review in the Copernicus Special Issue. There is no way that such a tuned, adjustable-parameter model should have been published without being tested using out-of-sample data. The fact that the reviewers did not require that testing shows the abysmal level of peer review for the Special Issue.
UPDATE: Greg Goodman in the comments points out that they appear to have done out-of-sample tests … but unfortunately, either they didn’t measure or they didn’t report any results of the tests, which means the method is still untested. At least where I come from, “test” in this sense means measure, compare, and report the results for the in-sample and the out-of-sample tests. Unless I missed it, nothing like that appears in the paper.
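To make “measure, compare, and report” concrete, here is a small sketch of what I mean, using the coefficient of determination. I am assuming the usual definition, R^2 = 1 − SS_res/SS_tot; the paper does not say how its R^2 was computed, so this may not match it. The numbers are toy values for illustration only, and the function names are mine.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS) / (total SS)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def report(label, actual, predicted):
    """One line of a results table: which sample, the score, and the sample size."""
    return f"{label}: R^2 = {r_squared(actual, predicted):.2f} (n = {len(actual)})"

# Toy numbers only, to show the shape of the report.
train_obs, train_fit = [10.0, 60.0, 110.0, 70.0, 20.0], [15.0, 55.0, 105.0, 75.0, 25.0]
test_obs, test_fit = [12.0, 80.0, 150.0, 90.0, 30.0], [60.0, 20.0, 40.0, 110.0, 100.0]

print(report("in-sample", train_obs, train_fit))
print(report("out-of-sample", test_obs, test_fit))
```

Note that with this definition the out-of-sample R^2 can come out negative, which means the model did worse than simply guessing the mean of the test data.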
NOTE: If you disagree with me or anyone else, please QUOTE WHAT YOU DISAGREE WITH, and let us know exactly where you think it went off the rails.
NOTE: The equation I show above is the complete all-in-one equation. In the Salvador paper, it is not shown in that form, but as a set of equations that are composed of the overall equation, plus equations for each of the underlying composite parameters. The Mathematica code to convert his set of equations into the single equation shown in Figure 2 is here.
BONUS QUESTION: What the heck does the note in Figure 1 mean when it says “The R^2 for the data from 1749 to 2013 is 0.85 with radiocarbon dating in the correlation.”? Where is the radiocarbon dating? All I see is the NASA data and the model.
BONUS MISTAKE: In the abstract, not buried in the paper but in the abstract, the author makes the following astounding claim:
The model is a slowly changing chaotic system with patterns that are never repeated in exactly the same way.
Say what? His model is not chaotic in the slightest. It is totally deterministic, and will assuredly repeat in exactly the same way after some unknown period of time.
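Here is a quick sketch of why, in Python. Since the fitted values are not published, the periods below are hypothetical whole numbers of years that I made up, and the toy model is a bare sum of cosines rather than his nested formula. The point it illustrates is only that a fixed formula in time has no sensitivity to initial conditions and, with periods like these, repeats after their least common multiple.

```python
import math
from functools import reduce

# Hypothetical periods in years; NOT the values fitted in the paper.
PERIODS = [11, 22, 88, 208]

def model(t):
    """Toy deterministic model: absolute value of a sum of cosines."""
    return abs(sum(math.cos(2.0 * math.pi * t / p) for p in PERIODS))

def lcm(a, b):
    return a * b // math.gcd(a, b)

# The whole pattern repeats after the least common multiple of the periods.
repeat = reduce(lcm, PERIODS)
print("pattern repeats every", repeat, "years")

worst = max(abs(model(t) - model(t + repeat)) for t in range(300))
print("largest difference one full repeat later:", worst)
```

If the ratios of the periods were not rational numbers, the sum would be quasi-periodic rather than exactly periodic, but it would still be fully deterministic and not chaotic.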
Sheesh … they claim this was edited and peer reviewed? The paper says:
Edited by: N.-A. Mörner
Reviewed by: H. Jelbring and one anonymous referee
Ah, well … as I said before, I’d have pulled the plug on the journal for scientific reasons, and that’s just one more example.