Guest Post by Willis Eschenbach
Among the papers in the Copernicus Special Issue of Pattern Recognition in Physics we find a paper from R. J. Salvador in which he says he has developed “A mathematical model of the sunspot cycle for the past 1000 yr”. Setting aside the difficulties of verifying sunspot numbers for, say, the year 1066, let’s look at how well his model can replicate the more recent record of the last few centuries.
Figure 1. The comparison of the Salvador model (red line) and the sunspot record since 1750. Sunspot data is from NASA; kudos to the author for identifying the data.
Dang, that’s impressive … so what’s not to like?
Well, what’s not to like is that this is just another curve fitting exercise. As old Joe Fourier pointed out, any arbitrary wave form can be broken down into a superposition (addition) of a number of underlying sine waves. So it should not be a surprise that Mr. Salvador has also been able to do that …
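To make Fourier's point concrete, here is a minimal Python sketch. It assumes nothing about the sunspot data: the eight numbers are made up, and `dft` and `inverse_dft` are illustrative helper names, not anything from the Salvador paper. It breaks an arbitrary series into sinusoids and then adds them back up.

```python
import cmath
import math

def dft(values):
    """Discrete Fourier transform: one complex coefficient per frequency."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]

def inverse_dft(coeffs):
    """Add the sinusoids back up to recover the original series."""
    n = len(coeffs)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(coeffs)) / n).real
        for t in range(n)
    ]

# Arbitrary made-up numbers, NOT sunspot data.
series = [3.0, -1.5, 4.0, 0.25, -2.0, 7.5, 1.0, -6.0]
rebuilt = inverse_dft(dft(series))

# The largest mismatch is floating-point rounding noise, nothing more.
max_error = max(abs(a - b) for a, b in zip(series, rebuilt))
print(max_error)
```

Eight numbers in, eight sinusoids out, perfect reconstruction. The fit is exact for any series you care to type in, which is exactly why a good fit by itself tells you nothing.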
However, it should also not be a surprise that this doesn’t mean anything. The problem is that no matter how well we can replicate the past with this method, it doesn’t mean that we can then predict the future. As the advertisements for stock brokers say, “Past performance is no guarantee of future success”.
One interesting question in all of this is the following: how many independent tunable parameters did the author have to use in order to get this fit?
Well, here’s the equation that he used … the sunspot number is the absolute value of
Figure 2. The Salvador Model. Unfortunately, in the paper he does not reveal the secret values of the parameters. However, he says you can email him if you want to know them. I passed on the opportunity.
So … how many parameters is he using? Well, we have P1, P2, P3, P4, F1, F2, F3, F4, N1, N2, N3, N4, N5, N6, N7, N8, L1, L2, L3, and L4 … plus the six decimal parameters, 0.322, 0.316, 0.284, 0.299, 0.00501, and 0.0351.
Now, that’s twenty tunable parameters, plus the six decimal parameters … plus of course the free choice of the form of the equation.
With twenty tunable parameters plus free choice of equation, is there anyone who is still surprised that he can get a fairly good match to the past? With that many degrees of freedom, you could make the proverbial elephant dance …
Now, could it actually be possible that his magic method will predict the future? Possible, I suppose so. Probable? No way. Look, I’ve done dozens and dozens and dozens of such analyses … and what I’ve found out is that past performance is assuredly no guarantee of future success.
So, is there a way to determine if such a method is any good? Sure. Not only is there such a method, but it’s a simple method, and we have discussed the method here on WUWT. And not only have we discussed the testing method, we’ve discussed the method with various of the authors of the Special Issue … to no avail, so it seems.
The way to test this kind of model is bozo-simple. Divide the data into the first half and the second half. Train your model using only the first half of the data. Then see how it performs on the second half, what’s called the “out of sample” data.
Then do it the other way around. You train the model on the second half, and see how it does on the first half, the new out-of-sample data. If you want, as a final check you can do the training on the middle half, and see how it works on the early and late data.
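Here is a rough Python sketch of that two-way test. Everything in it is an assumption for illustration: the series is made up (not sunspot data), and the "model" is a polynomial with one free coefficient per training point, a deliberately over-flexible stand-in rather than Salvador's equation. The names `lagrange_fit`, `r_squared` and `score` are mine.

```python
import math

def lagrange_fit(ts, ys):
    """Return a polynomial through every (t, y) pair: one free coefficient per point."""
    def model(t):
        total = 0.0
        for j, (tj, yj) in enumerate(zip(ts, ys)):
            weight = 1.0
            for m, tm in enumerate(ts):
                if m != j:
                    weight *= (t - tm) / (tj - tm)
            total += yj * weight
        return total
    return model

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up series (NOT sunspot data): an 11-step cycle plus a faster wiggle.
times = list(range(24))
values = [50 + 40 * math.sin(2 * math.pi * t / 11) + 5 * math.sin(1.7 * t) for t in times]

half = len(times) // 2
first = (times[:half], values[:half])
second = (times[half:], values[half:])

def score(train, test):
    """Fit on `train` only, then report (in-sample R^2, out-of-sample R^2)."""
    model = lagrange_fit(*train)
    r2_in = r_squared(train[1], [model(t) for t in train[0]])
    r2_out = r_squared(test[1], [model(t) for t in test[0]])
    return r2_in, r2_out

forward = score(first, second)    # train on first half, test on second
backward = score(second, first)   # train on second half, test on first
print("forward :", forward)
print("backward:", backward)
```

The in-sample fit is perfect in both directions and the out-of-sample R^2 is negative in both, meaning the model does worse than simply guessing the average. Swap in any fitting routine you like for `lagrange_fit`; the part that matters is reporting both numbers side by side.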
I would be shocked if the author’s model could pass that test. Why? Because if it could be done, it could be done easily and cleanly by a simple Fourier analysis. And if you think scientists haven’t tried Fourier analysis to predict the future evolution of the sunspot record, think again. Humans are much more curious than that.
In fact, the Salvador model shown in Figure 2 above is like a stone-age version of a Fourier analysis. But instead of simply decomposing the data into the simple underlying orthogonal sine waves, it decomposes the data into some incredibly complex function of cosines of the ratio of cosines and the like … which of course could be replaced by the equivalent and much simpler Fourier sine waves.
But neither one of them, the Fourier model or the Salvador model, can predict the future evolution of the sunspot cycles. Nature is simply not that simple.
I bring up this study in part to point out that it’s like a Fred Flintstone version of a Fourier analysis, using no less than twenty tunable parameters, that has not been tested out-of-sample.
More importantly, I bring it up to show the appalling lack of peer review in the Copernicus Special Issue. There is no way that such a tuned, adjustable parameter model should have been published without being tested using out of sample data. The fact that the reviewers did not require that testing shows the abysmal level of peer review for the Special Issue.
w.
UPDATE: Greg Goodman in the comments points out that they appear to have done out-of-sample tests … but unfortunately, either they didn’t measure or they didn’t report any results of the tests, which means the method is still untested. At least where I come from, “test” in this sense means measure, compare, and report the results for the in-sample and the out-of-sample tests. Unless I missed it, nothing like that appears in the paper.
NOTE: If you disagree with me or anyone else, please QUOTE WHAT YOU DISAGREE WITH, and let us know exactly where you think it went off the rails.
NOTE: The equation I show above is the complete all-in-one equation. In the Salvador paper, it is not shown in that form, but as a set of equations that are composed of the overall equation, plus equations for each of the underlying composite parameters. The Mathematica code to convert his set of equations into the single equation shown in Figure 2 is here.
BONUS QUESTION: What the heck does the note in Figure 1 mean when it says “The R^2 for the data from 1749 to 2013 is 0.85 with radiocarbon dating in the correlation.”? Where is the radiocarbon dating? All I see is the NASA data and the model.
BONUS MISTAKE: In the abstract, not buried in the paper but in the abstract, the author makes the following astounding claim:
The model is a slowly changing chaotic system with patterns that are never repeated in exactly the same way.
Say what? His model is not chaotic in the slightest. It is totally deterministic, and will assuredly repeat in exactly the same way after some unknown period of time.
Sheesh … they claim this was edited and peer reviewed? The paper says:
Edited by: N.-A. Mörner
Reviewed by: H. Jelbring and one anonymous referee
Ah, well … as I said before, I’d have pulled the plug on the journal for scientific reasons, and that’s just one more example.
Greg, a paper that has been cited many times is much less likely to be fundamentally flawed than a paper that has received few citations. The reason for this is that it has been scrutinised by more eyes and has been tested by its use as the basis for other work. The chance of a flaw going undetected decreases the more the work is used, but never disappears entirely. Of course you can find examples where the metric doesn’t give a reliable indication, but that will be true of any metric, and is not a good reason for ignoring a metric that is amongst the best that scientists have found useful so far.
Now if you think that scientists are conformists, I have to say my experience is different: there is nothing we like more than to show something is incorrect (falsification is an important concept in science) or to demonstrate some result that our research field will find surprising (that is what makes high impact papers). It is often said that getting academics to coordinate is like herding cats; there is more than enough truth in that for the analogy to work rather well.
Are you sure the system is truly chaotic? How do you know? Are you claiming to know ALL the variables Willis? Chaos is not the appearance of disorder, chaos is the lack of knowledge to see the order. You perceive a chaotic solar output because your viewpoint is based on the “current” understanding of the sun’s dynamic processes. How do you know your understanding is really correct? You don’t and that’s where curve fitting comes into play to find a constant.
The idea that the first third in a series could be predicted by the data from the second or third set is based on the notion that there are no UNKNOWN processes. The fact is we don’t know all the processes that affect the magnetic flux output of the sun over 50 years, 100 years or 1000 years for that matter, much less to what degree for each process. Fitting the curves treats the UNKNOWN variables as a constant, a fudge factor, that seems to work most of the time. And a constant doesn’t have to be a fixed number; it could also be an output from an equation. It may take a very long series, on the order of 1000 years, just to find a close approximation of the constant. So this exercise by Salvador is valid as a predictive tool to see what his predictions are for the next two cycles and therefore is falsifiable per Popper’s requirement.
Willis, I think you are prematurely hyperventilating.
“So this exercise by Salvador is valid as a predictive tool to see what his predictions are for the next two cycles and therefore is falsifiable per Popper’s requirement.”
True, but there is little reason to think that his model will work well, given that it has performed badly in the out of sample testing that has already been performed. Also, since two cycles is a rather small amount of data compared to that used in the existing out of sample testing, it will not be that easy to draw a solid statistical conclusion either way.
toms3d says:
“The conclusion of the author is simply:
”Fortunately because the changes to the base frequencies and phasing occur slowly in terms of human life spans, we can make forecasts that may be useful””
May be useful for predicting SSN if it could work, but not for the weather and climate. Look at solar cycle 8:
http://www.solen.info/solar/cycl8.html
and now look at CET 1833-1843, it’s dropping to LIA temperatures:
http://climexp.knmi.nl/data/tcet.dat
then compare SC 16 with CET:
http://www.solen.info/solar/cycl16.html
There is another whole side of planetary ordering of solar activity that is driving temperature deviations in the short term that can be remarkably unrelated to solar cycle size. That’s the really useful bit. There are though typically deeper and more frequent cold shots in the weakest cycles, but Salvador has not identified even the average period in which the weaker solar cycles reoccur.
Greg Goodman says:
January 22, 2014 at 6:48 am
“Testing”, whether in or out of sample, requires measurement and comparison. Near as I can tell, they have done neither. But you say they have, and perhaps you are right … so where are the results of the out of sample tests?

As with the other papers in the series, they have waved their hands in the direction of seriously examining their claims. And I’m sure some people are impressed with the pretty pictures.
But if you say they’ve tested it out of sample, I’m sure that you can compare for us the R^2 and p-value of the out-of-sample forecast with the R^2 and p-value of the in-sample forecast that is shown in Figure 6.
I still say the reviewers did not do their job, that we have no results of out-of-sample testing, we have no code, and as such, the paper should not have been published as it stands.
w.
Thanks, Salvador, for your work; a great read and insights. Not quite sure I agree with all of it though; of course, time will tell.
Obviously here in the Temple of Greatness twas not appreciated, and how the mighty Priests have enjoyed burning the heretic..
Sad Rude Poeple
Willis Eschenbach says:
January 22, 2014 at 3:43 am
“Thanks, Kasuha. Each of the individual sin and cosine functions that make up the equation repeats in a regular periodic manner. How can their sum not be periodic?
If what you are saying is true, seems like it would make a theoretically perfect random number generator, one that never, ever repeats… and I doubt that.
Seems to me that the sum/product/difference whatever of a finite number of infinitely repeating cyclical functions has to be a repeating cyclical function, and that that is a recurring and big problem in random number generators … but I’ve been wrong before …”
__________________________________________________________________
There’s no point in opinions or beliefs if we can check it:
http://www.wolframalpha.com/input/?i=periodicity+of+y+%3D+sin%28%28x+%2B+sin%28x%29%29%2F%281+%2B+sin%28x%29%29%29
Note that sin and cos functions are interchangeable if we have parameters in them which affect phase, which is the case here.
But it’s also not true that it would make a perfect random number generator. It would actually be a very bad random number generator.
Georgie:
Please explain the purpose of your post at January 22, 2014 at 1:27 pm assuming it is other than to demonstrate you are a sad, rude person who cannot spell.
Richard
From the paper:
“Wilson also shows that the strength of the tidal force depends on the heliocentric latitude of Venus and the mean distance of Jupiter from the Sun, and that when these forces are weakest, solar minimums occur. This happens approximately every 165.5 yr. The frequency to produce a 165.5 yr beat with 22.14 yr is 19.528 yr.”
I corrected Ian on that figure a while back. At 14 Jupiter orbits (166.0648 sidereal years) there are 15 average length solar cycles of 11.071 years. And the beat of 166.0648 and 11.071 years is 11.8617 years, exactly one Jupiter orbit. The quoted 19.528yr period doesn’t exist anywhere as a “frequency”, it’s the axial period of his 165.5 and 22.14.
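The arithmetic in this comment is easy to check. A small Python sketch follows; `beat_period` and `axial_period` are names chosen here for illustration, and reading "axial period" as the period obtained by adding the two frequencies is an assumption about the commenter's usage.

```python
def beat_period(p1, p2):
    """Period of the beat between two cycles: 1 / |1/p1 - 1/p2|."""
    return 1.0 / abs(1.0 / p1 - 1.0 / p2)

def axial_period(p1, p2):
    """Period from adding the two frequencies: 1 / (1/p1 + 1/p2)."""
    return 1.0 / (1.0 / p1 + 1.0 / p2)

# 14 Jupiter orbits against the 11.071 yr average solar cycle.
print(beat_period(166.0648, 11.071))

# The 19.528 yr figure as the axial period of 165.5 yr and 22.14 yr.
print(axial_period(165.5, 22.14))

# The paper's statement: 19.528 yr beating with 22.14 yr gives about 165.5 yr.
print(beat_period(19.528, 22.14))
```

The first value comes out at roughly 11.86 years, one Jupiter orbit, because 15 solar cycles fit into 14 Jupiter orbits: 15/(14 P) - 1/(14 P) = 1/P.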
Willis Eschenbach says:
January 22, 2014 at 3:22 am
Stephen Wilde says:
January 22, 2014 at 2:44 am
Strange then that various commentators did actually predict the current solar quietness whilst the establishment was still predicting that cycle 24 would be another strong one.
Actually, most “establishment” predictions weren’t for a strong cycle, they were for a weak cycle. There is a list of them here, along with an interesting analysis of the various methods. See Table 1.
w.
Ah, no, not quite, I figure the folks at NASA are as establishment as it gets these days.
Here is the NASA prediction for cycle 24
http://www.swpc.noaa.gov/SolarCycle/SC24/index.html
click on the link Solar Cycles 24 Consensus Prediction (PPT)
It was so far off as to be laughable. And they were going to massage the data once they were sure the cycle had started. (sound familiar?)
They could have done better with darts on a wall chart.
see: http://www.landscheidt.info/?q=node/50 for a more accurate way to estimate solar cycles.
the ‘peak’ about 48 to 50 months, and the ‘max’ pretty much wasn’t.
r
Willis, the hilarious thing is this: If the IPCC published such a forecast and refused to put numbers on it, or didn’t accurately predict magnitudes but got the direction (increasing or decreasing) right, people would scream.
Here’s a good one. In 1988 Hansen’s model predicted increasing temperatures under all scenarios. Although he got the magnitude wrong he got the direction right. haha
#####
unknown period of time.
———————————————————————–
Thanks Willis I thought you might eventually use it.! Best Regards
Willis
This is an incredibly patronising post and filled with sneer. One could accuse you of “Kettle calling Pot black” here.
BTW the author does do your suggested “Bozo test”.
Furthermore, he qualifies what he meant by ‘chaotic’ at the end of the paper (he wasn’t referring to the model as such rather the process).
I passed on the opportunity.
Yet saw fit to pull him on it.
On the paper…
It ain’t sophisticated stuff but then neither is applying a discrete Fourier transform – but then he now has a parameterised function that (for testing) is a whole lot neater than trying to expand a signal in the frequency domain with extra terms beyond the sample window; and before you say it, there are a myriad of ways to deal with this but none of them as simple as they seem and none of them right only “best-for-case”. I don’t know enough about solar cycles to know if this all has any value but overall I found it interesting and showed exactly what he wanted to show.
Steven Mosher says:
January 22, 2014 at 2:21 pm
Willis, the hilarious thing is this: If the IPCC published such a forecast and refused to put numbers on it, or didn’t accurately predict magnitudes but got the direction (increasing or decreasing) right, people would scream.
Here’s a good one. In 1988 Hansen’s model predicted increasing temperatures under all scenarios. Although he got the magnitude wrong he got the direction right. haha
#####
————————————————————————————–
No he didn’t; it hasn’t warmed for going on 17 years, ha ha!
Matthew R Marler says:
January 22, 2014 at 11:02 am
Lots of chaotic systems can appear periodic over a few to many cycles, an example being the near periodicity of Earth’s revolution about the sun
Yeah, I think (but I could be wrong) Willis was talking about the model, so we may all be talking at cross-purposes here. But if he is assuming that chaotic systems are not periodic as you suggest then he is wrong and you’re right. The signal can be non-stationary (that is, the periodicity is not constant and can change randomly throughout the chronology). Of course there is a matter of scale. That point was made at the end of the paper.
Greg Goodman says:
January 22, 2014 at 7:53 am
If you generate hundreds of “planetary constants” and then pick half a dozen at will in parameter fitting, it is just like a quantised free parameter.
=========
Agreed, that lends itself to a form of cherry picking, which this type of analysis is prone to. Thus you must check to see if the results have predictive power.
However, one cannot simply dismiss the work until a predictive test has been done. Equally one cannot embrace the results without a predictive test, because of the large number of failed results in the past using similar approaches.
FrankK.
Last I checked it’s gone up since 1988. See how easy it is when you don’t quantify things.
Matthew R Marler says:
January 22, 2014 at 11:02 am
an example being the near periodicity of Earth’s revolution about the sun
================
Orbits in an N-body system are inherently unstable mathematically. It was not until the Voyager photographs of Jupiter’s rings that we began to understand why the planets in the solar system haven’t long ago been thrown out of orbit or crashed into the sun.
As Kepler proposed, there is a resonance between the objects in the solar system, such that they are always adjusting their positions relative to each other, until over time they reach a “stable” arrangement, where their orbits oscillate within bounds, to minimize the energy of the entire system.
Some planets will move inwards, some outwards. Some will spin faster, some will slow down, until over time the system stabilizes at the lowest energy. If one item then tries to change its position from this pattern, it raises the energy of the system above the minimum. The other objects will shift slightly in response, shepherding the first object back into place.
We know this happens by looking at the rings. Our math says they should not be there. Reality says they are. We see this sort of behavior everywhere in nature. Somehow the system always seeks the lowest energy level and there it stabilizes. Ask any boat captain why, if they lose power, the boat will always turn broadside to the waves.
The mention here of Fourier Analysis (FA) is not specific enough. Most likely FA should not even apply here. FA is familiar in its two most basic forms: the Fourier Series (FS) [ for example, … 2 Cos(f) + 7 Sin(2f) +…], which applies to periodic functions (integer harmonics), and the Fourier Transform (FT, an integral transform), which applies to non-periodic functions. The famous Fast Fourier Transform (FFT) efficiently computes the Discrete Fourier Transform (DFT) and, being a computation on discrete data, can APPROXIMATE either the FS or the FT when they can’t be solved analytically (the usual case). [ Incidentally, although composed of the sum of two periodic components, Cos[f] + Cos[sqrt(2)f], for example, is not periodic.] As another reference example, the highly regarded Akasofu linear trend + sinusoid is neither an FS nor an FT, being partly periodic and partly not, but is delightfully easy to envision as just an equation. In this post, we also have, most basically, just an equation.
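The bracketed remark about Cos[f] + Cos[sqrt(2)f] can be made concrete. A plain sum of sinusoids repeats only if some time T is a whole multiple of every component period, which requires the period ratios to be rational. The Python sketch below finds that T for rational periods; `common_period` is a name introduced here for illustration, and it covers only plain sums, not the nested cosines of the model in the post.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def common_period(periods):
    """Smallest T that is a whole multiple of every period (periods must be rational)."""
    fracs = [Fraction(p) for p in periods]
    lcm = lambda a, b: a * b // gcd(a, b)
    # For fractions in lowest terms: lcm = lcm(numerators) / gcd(denominators).
    numerator = reduce(lcm, (f.numerator for f in fracs))
    denominator = reduce(gcd, (f.denominator for f in fracs))
    return Fraction(numerator, denominator)

print(common_period([11, 12]))                          # 132
print(common_period([Fraction(3, 2), Fraction(5, 4)]))  # 15/2
```

With periods 2*pi and 2*pi/sqrt(2) the ratio is sqrt(2), which is irrational, so no such T exists and the sum never exactly repeats. Even with rational periods, the common period can be enormously longer than any single component.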
It is not clear that Fourier Analysis should be applied to sunspot data (it seems otherwise), which is at best quasi-periodic. Indeed the equations Willis posts are phase-modulation or frequency-modulation equations and almost certainly can (in theory) be solved in terms of discrete frequencies (something like, but a lot more tedious than the usual Bessel-function sideband amplitudes). An FFT could be then applied to verifying these “sideband” calculations, if they were done, but it is far from certain any additional insight would result.
Before working on this however, one might want to consider what physics (if any) would suggest a modulation result. And, the projections of the model into the test gaps would need to be a lot better – like approaching the quality of the full fit.
tallbloke says:
January 22, 2014 at 5:54 am
OK, tallbloke, here you go. I’ve used 15 tunable parameters, plus your four fixed parameters.


That’s six fewer parameters than Salvador used. However, it is equally meaningless.
w.
Willis,
Look up the term superficial again please.
I just don’t quite understand what they teach kids these days… a few decades ago, when I was studying meteorology under one of the greats (‘Doc’ Saucier, at NC State), I had it hammered into my brain that models were quite useful, but you had to really, really respect boundary conditions. You can fit a curve to any set of data if you have enough degrees of freedom. The test was how well it behaved once you went outside the boundaries of the model. And that was just for predicting weather patterns a day or two in advance.
The Doc would have just laughed at anybody trying to predict average temperatures even a decade in the future.
For what it’s worth, tallbloke, the Fourier transform of the sunspot cycle has no long-period peaks. The energy is concentrated at various periods in the range of about 10 to 12 years, and there is little energy at periods longer than that. Salvador gets the short periods from the beat frequencies of much longer periods. However, those longer periods are not evident in the Fourier transform.
Regards,
w.
kuhnkat says:
January 22, 2014 at 6:09 pm
Is this a superficial analysis? Since a fitted model with 20 tunable parameters is a superficial model, I suppose any analysis of it has to be superficial.
w.
patrio says:
January 22, 2014 at 6:08 am
Not true. It specifically says that the 20 tunable parameters are just that, tuned. Nor does it “lay out the physics used to derive” the six decimal numbers. He picked several astronomical cycle lengths, and used those. A number of other people, including Scafetta, have done the same thing … but they picked different astronomical cycle lengths, or averages of two cycle lengths, or half cycle lengths … so where is the “physics” in picking astronomical cycles?
w.