Guest Post by Willis Eschenbach
In a comment on a recent post, I was pointed to a study making the following surprising claim:
Here, we analyze the stream flow of one of the largest rivers in the world, the Paraná in southeastern South America. For the last century, we find a strong correlation with the sunspot number, in multidecadal time scales, and with larger solar activity corresponding to larger stream flow. The correlation coefficient is r = 0.78, significant to a 99% level.
I’ve seen the Parana River … where I was, it was too thick to drink and too thin to plow. So this was interesting to me. Particularly interesting because in climate science a correlation of 0.78 combined with a 99% significance level (p-value of 0.01) would be a very strong result … in fact, to me that seemed like a very suspiciously strong result. After all, here is their raw data used for the comparison:
Figure 1. First figure in the Parana paper, showing the streamflow in the top panel, and sunspot number (SN) and total solar irradiance (TSI) in the lower two panels.
They are claiming a 0.78 correlation between the data in panel (a) and the data in panel (b) … I looked at Figure 1 and went “Say what?”. Call me crazy, but do you see any kind of strong 11-year cycle in the top panel? Because I sure don’t. In addition, when the long-term average of sunspots rises, I don’t see the streamflow rising. If there is a correlation between sunspots and streamflow, why doesn’t a several-decade period of increased sunspots lead to increased streamflow?
So how did they get the apparent correlation? Well, therein lies a tale … because Figure 2 shows what they ended up analyzing.
And wow, that sure looks like a very, very strong correlation … so how did they get there from such an unpromising start?
Well, first they took the actual data. Then, from the actual data they subtracted the “secular trends” (see the dark smooth lines in Figure 1). The effect of this first processing step is curious.
Look back at Figure 1. IF streamflow and sunspots were correlated, we’d expect them to move in parallel in the long term as well as the short term. But inconveniently for their theory … they don’t move in parallel. How to resolve it? Well, since the long-term secular trend data doesn’t support their hypothesis, their solution was to simply subtract that bad-mannered part out from the data.
I’m sure you can see the problems with that procedure. But we’ll let that go, the damage is fairly minor, and look at the next step, where the real destruction is done.
They say in Figure 2 that the sunspot data was “smoothed by an 11-yr running mean to smooth out the solar cycle”. However, it is apparent that the authors didn’t realize the effect of what they were doing. Calling what they did “smoothing” is a huge stretch. Figure 3 shows the residual sunspot anomaly (in blue) after removing the secular trend (as the authors did in the paper), along with the 11-year moving average of that exact same data (in red). Again as the authors did, I’ve normalized the two to allow for direct comparison:
Figure 3. Sunspot anomaly data (blue line), compared to the eleven-year centered moving average of the sunspot anomaly data (red line). Both datasets have been normalized to a mean of zero and a standard deviation of one.
Talk about a smoothing horror show, that has to be the poster child for bad smoothing. For starters, look at what the “smoothing” does to the sunspot data from 1975 to 2000 … instead of having two peaks at the tops of the two sunspot cycles (blue line, 1980 and 1991), the “smoothed” red line shows one large central peak, and two side lobes. Not only that, but the central low spot around 1986 has now been magically converted into a peak.
Now look at what the smoothing has done to the 1958 peak in sunspot numbers … it’s now twice as wide, and it has two peaks instead of one. Not only that, but the larger of the two peaks occurs where the sunspots actually bottomed out around 1954 … YIKES!
Finally, I knew this was going to be ugly, but I didn’t realize how ugly. The most surprising part to me is that their “smoothed” version of the data is actually negatively correlated to the data itself … astounding.
Part of the problem is the use of a running mean to smooth the data … a Very Bad Idea™ in itself. However, in this case it is exacerbated by the choice of the length of the average, 11 years. Sunspot cycles range from something like nine to thirteen years or so. As a result, cycles longer and shorter than the 11 year filter get averaged very differently. The net result is that we end up with some of the frequency data aliased into the average as amplitude data … resulting in the very different results from about 1945-60 versus the results 1975-2000.
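The peak-to-valley flips can be illustrated with a short calculation. This is a minimal sketch, assuming annual data and an idealized, perfectly sinusoidal cycle (real sunspot cycles are not sinusoids); the function name `running_mean_gain` is made up for this illustration. It computes the factor by which a centered 11-point running mean scales a sinusoid of a given period. A negative factor means the filter turns peaks into valleys.

```python
import math

def running_mean_gain(window: int, period: float) -> float:
    """Gain of a centered `window`-point running mean on a sinusoid.

    `window` (odd) and `period` are both in samples (years here).
    A negative gain means the cycle comes out inverted.
    """
    f = 1.0 / period  # cycles per sample
    return math.sin(math.pi * f * window) / (window * math.sin(math.pi * f))

# Idealized cycle lengths spanning the nine-to-thirteen-year range
gains = {p: running_mean_gain(11, p) for p in (9, 10, 11, 12, 13)}
for period, gain in gains.items():
    print(f"{period}-year cycle: gain {gain:+.3f}")
```

Under these assumptions, cycles shorter than 11 years come through with a negative gain (inverted), an exact 11-year cycle is removed entirely, and cycles longer than 11 years come through weakly but right side up. That is consistent with different stretches of the record being mangled in different ways.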
Overall? I don’t care what they end up comparing to the red line … they are not comparing it to sunspots, not in any way, shape, or form. The blue line shows sunspots. The red line shows a mathematician’s nightmare.
How about the fact that they performed the same procedure on the Parana streamflow data? Does that make a difference? Figure 4 shows that result:
Figure 4. Parana streamflow anomaly data (blue line), compared to the eleven-year centered moving average of the streamflow anomaly data (red line). Both datasets have been normalized to a mean of zero and a standard deviation of 1.
As you can see, the damage done by the running mean is nowhere near as severe in this streamflow dataset as it was for the sunspots. Although there are still a lot of reversals, with peaks turned into valleys, at least the correlation is still positive. This is because the streamflow data does NOT contain the roughly eleven-year cycles present in the sunspot data.
Conclusions? Well, my first conclusion is that as a result of doing what the authors did, comparing the red line in Figure 3 with the red line in Figure 4 says absolutely nothing about whether the Parana river streamflow is related to sunspots or not. The two red lines have very little to do with anything.
My second conclusion is, NEVER RUN STATISTICAL ANALYSES ON SMOOTHED DATA. I don’t care if you use gaussian smoothing or Fourier smoothing or boxcar smoothing or loess smoothing, if you want to do statistical analyses, you need to compare the datasets themselves, full stop. Statistically analyzing a smoothed dataset is a mug’s game. The problem is that as in this case, the smoothing can actually introduce totally false, spurious correlations. There’s an old post of mine on spurious correlation and Gaussian smoothing here for those interested in an example.
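The inflation of spurious correlations by smoothing can be checked with a small Monte Carlo experiment. This is a hedged sketch, not the analysis from the spreadsheet: it assumes pure Gaussian white noise, 100 "years" per series, 300 trials, and an 11-point running mean, all of which are arbitrary choices made for illustration. The two series in each trial are independent by construction, so any correlation between them is spurious.

```python
import math
import random
import statistics

def running_mean(xs, window):
    """Running mean over every fully covered window."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
raw_r, smooth_r = [], []
for _ in range(300):
    a = [rng.gauss(0, 1) for _ in range(100)]  # independent noise series
    b = [rng.gauss(0, 1) for _ in range(100)]
    raw_r.append(pearson(a, b))
    smooth_r.append(pearson(running_mean(a, 11), running_mean(b, 11)))

spread_raw = statistics.pstdev(raw_r)
spread_smooth = statistics.pstdev(smooth_r)
print(f"spread of r, raw data:      {spread_raw:.3f}")
print(f"spread of r, smoothed data: {spread_smooth:.3f}")
```

The spread of the correlation coefficient is much wider for the smoothed pairs than for the raw pairs, so large values of r turn up by chance far more often once both series have been smoothed.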
Please be clear that I’m not accusing the authors of any bad intent in this matter. To me, the problem is simply that they didn’t understand and were unaware of the effect of their “smoothing” on the data.
Finally, consider how many rivers there are in the world. You can be assured that people have looked at many of them to find a connection with sunspots. If this is the best evidence, it’s no evidence at all. And with that many rivers examined, a p-value of 0.05 is now far too generous. The more places you look, the more chance of finding a spurious correlation. This means that the more rivers you look at, the stronger your results must be to be statistically significant … and we don’t yet have even passable results from the Parana data. So as to rivers and sunspots, the jury is still out.
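The arithmetic behind "the more places you look" is simple enough to sketch. The example below assumes the tests are independent, which real rivers in neighboring basins are not, and the river counts are hypothetical, chosen only for illustration. It shows the chance of at least one false positive at a per-test threshold of 0.05, along with the Bonferroni-adjusted threshold, which is one common (and conservative) correction.

```python
def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the overall error rate near alpha."""
    return alpha / n_tests

for n_rivers in (1, 10, 50, 100):  # hypothetical numbers of rivers examined
    print(
        f"{n_rivers:3d} rivers: "
        f"P(at least one false hit) = {familywise_error(n_rivers):.2f}, "
        f"adjusted threshold = {bonferroni_alpha(n_rivers):.4f}"
    )
```

With ten independent looks, the odds of at least one spurious "significant" result at the 0.05 level are already about 40%.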
How about for sea level and sunspots? Are they related? I can’t do better than to direct you to the 1985 study by Woodworth et al. entitled A world-wide search for the 11-yr solar cycle in mean sea-level records, whose abstract says:
Tide gauge records from throughout the world have been examined for evidence of the 11-yr solar cycle in mean sea-level (MSL). In Europe an amplitude of 10-15 mm is observed with a phase relative to the sunspot cycle similar to that expected as a response to forcing from previously reported solar cycles in sea-level air pressure and winds. At the highest European latitudes the MSL solar cycle is in antiphase to the sunspot cycle while at mid-latitudes it changes to being approximately in phase. Elsewhere in the world there is no convincing evidence for an 11-yr component in MSL records.
So … of the 28 geographical locations examined, only four show a statistically significant signal. Some places it’s acting the way that we’d expect … other places it’s not. Nowhere is it strong.
I haven’t bothered to go through their math, except for their significance calculations. They appear to be correct, including the adjustment to the required significance given the fact that they’ve looked in 28 places, which means that the significance threshold has to be adjusted. Good on them 1980s scientists, they did the numbers right back then.
However, and it is a very big however, as is common with such analyses from the 1980s, I see no sign that the results have been adjusted for autocorrelation. Given that both the sunspot data and the sea level data are highly autocorrelated, this can only move the results in the direction of less statistical significance … meaning, of course, that the four results that were significant are likely not to remain so once the results are adjusted for autocorrelation.
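To give a feel for the size of the autocorrelation adjustment, here is a minimal sketch using one common approximation for the effective number of independent points when two series each behave roughly like first-order autoregressive processes. The function names and the example numbers (100 data points, lag-1 autocorrelation of 0.9 in each series) are assumptions for illustration, not values taken from the Woodworth study.

```python
def lag1_autocorr(xs):
    """Lag-1 autocorrelation of a series."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

def effective_n(n, r1, r2):
    """Approximate effective sample size for correlating two series
    whose lag-1 autocorrelations are r1 and r2."""
    return n * (1.0 - r1 * r2) / (1.0 + r1 * r2)

# Hypothetical example: 100 points, both series strongly autocorrelated
n_eff = effective_n(100, 0.9, 0.9)
print(f"effective sample size: {n_eff:.1f} out of 100")
```

In that hypothetical case the 100 data points carry about as much information as ten or eleven independent ones, which is why a result that looks significant before the adjustment may well not survive it.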
Is there a sunspot effect on the climate? Maybe so, maybe no … but given the number of hours people have spent looking for it, including myself and many, many others, if it is there, it’s likely very weak.
My best regards to all,
w.
NOTA BENE! If you disagree with something I said, please quote my exact words, and then tell me why you think I’m wrong. Telling me things like that my science sucks or baldly stating that I don’t understand the math doesn’t help me in the slightest. If I’m wrong I want to know it, but I have no use for claims like “Willis, you are so off-base in this case that you’re not even wrong.” Perhaps I am, but we’ll never know unless you specify exactly what I said that was wrong, and what was wrong with it.
So if you want me to treat you and your comments with respect, quote what you object to, and specify your objection. It’s the only way I can know what the heck you are talking about, and I’ve had it up to here with vague unsupported accusations of wrongdoing.
DATA: Digitized Parana streamflow data from the paper plus SIDC Sunspot data and all analyses for this post are on an Excel spreadsheet here. You’ll have to break the links, they are to my formula for Gaussian smoothing.
PS—Thanks to my undersea contacts for coming up with a copy of the thirty-year-old Woodworth study, and a hat tip to Dr. Holgate and Steve McIntyre at Climate Audit for the lead to the study. Dr. Holgate is well-known in sea level circles, here’s his comment on the sunspot question:
Many people have tried to link climate variations to sunspot cycles. My own feeling is that they both happen to exhibit variability on the same timescales without being causal. No one has yet shown a mechanism you understand. There is also no trend in the sunspot cycle so that can’t explain the overall rise in sea levels even if it could explain the variability. If someone can come up with a mechanism then I’d be open to that possibility but at present it doesn’t look likely to me.
If you’re interested in solar cycles and sea level, you might look at a paper written by my boss a few years back: Woodworth, P.L. “A world-wide search for the 11-yr solar cycle in mean sea-level records.” Geophysical Journal of the Royal Astronomical Society. 80(3) pp743-755
You’ll appreciate that this is a well-trodden path. My own feeling is that it’s not the determining factor in sea level rise, or even accounts for the trend, but there may be something in the variability. I’m just surprised that if there is, it hasn’t been clearly shown yet.
I can only agree …

Wow! Talk about applying Finagle’s Infinitely Variable Constant to the raw data!
Willis, that’s quite a story! Thanks for chasing it down.
Cheers — Pete Tillman
—
The generation of random numbers is too important
to be left to chance.
Cool! Do you read this amazing blog WUWT where he shows correlation is irrelevant.
You are right, try, just try.
Great work! an analysis that you even you do not understand.
I guess on top of everything else there are precious few people or articles we can trust unless we have the knowledge to deconstruct the message ourselves … or in this case have Willis deconstruct it for us.
thanks again for shining the light on the vermin.
Another conclusion might be that the good old eyeball is an underrated way of spotting correlation, or lack of correlation. Fancy statistics might tease out correlations that are not obvious, but they seem to produce spurious artefacts all too often. And the fancier the statistics, the more sceptical we should be about claimed results. The history of Mannian and Steigian stats should tell us that.
Compare WJR Alexander et al. 2007
Linkages between solar activity, climate predictability and water resource development
JOURNAL OF THE SOUTH AFRICAN INSTITUTION OF CIVIL ENGINEERING Vol 49 No 2, June 2007, Pages 32–44, Paper 659
Alexander’s life long effort was to compile all hydrology related data for the Southern African region. He is making all the data available on disk to whoever requests it to the address given.
My wife worked as a biostatistician at a medical school. She was involved in lots of research, because the medical journals require that any article involving statistical analysis include a qualified statistician as a co-author. I wish the climate journals had the same requirement.
justsomeguy31167 says:
January 25, 2014 at 4:43 pm
Great work! an analysis that you even you do not understand.
==============
Which leads me to the conclusion, that you understand what was not understood ?
Care to enlighten us ?
Willis:
Why don’t you read my paper
Strong signature of the active Sun in 100 years of terrestrial insolation data
in Annalen der Physik 552,6, p.372 (2010)
http://onlinelibrary.wiley.com/doi/10.1002/andp.201000019/pdf
It is also discussed in the book of Vahrenholt & Luening
Sincerely,
Werner Weber
I have two grandfather clocks. The timing of the chimes is very strongly correlated, to over a 99% level. Also, one of the two always chimes first, so that one must be causing the other one to chime.
When the first one of them stops chiming so does the other, so this confirms the causation (whenever a power failure occurs).
This comment is no sillier than the Parana river study.
“Stream flow correlated with sunspot activity”. Sorry, but I stopped being interested after that much nonsense.
I thought about seeing what and how they approached the Sun Spot numbers and even looked at this:
http://www.leif.org/research/CEAB-Cliver-et-al-2013.pdf
. . . so maybe Leif will comment.
Regardless of whether they used “International” or “Group”, the smoothing and processing seems to make it meaningless. I also thought of the William M. (Matt) Briggs series and maybe it will be useful to post those links:
#1 Do not calculate correlations after smoothing data __Note p=86
http://wmbriggs.com/blog/?p=86
#2 Do not smooth times series, you hockey puck! __Change p to 195
#3 Do NOT smooth time series before computing forecast skill __Change p to 735
Willis, it’s got to be a thankless job, documenting poor science.
What strikes me in your figure one is the high correlation between graphs (b) and (c), which appears to be between sunspots and solar insolation, unless I am missing an inversion somewhere. This seems at odds with Dr. Svalgaard’s assurances that sunspots reduce the output of solar energy.
“NEVER RUN STATISTICAL ANALYSES ON SMOOTHED DATA” – Mmmm, sometimes it’s appropriate.
I would disagree in one situation only – a 12-month running mean can average out the (fixed length!) seasonal cycle for the purposes of looking at longer term trends. Not 13 months (which some people for unknown reasons seem to prefer, but which introduces spurious beat frequencies due to the phase difference between 12 and 13 months), not 60 months, but 12.
Aside from that fairly minor quibble, excellent post. There are entirely too many papers that apply high, low, and bandpass filtering – and then claim extraordinary results from what’s left when their filtering has thrown out the dominant data, leaving only minor side frequencies that just happen to match their preconceptions.
It kills Willis to acknowledge that I posted the paper link in a comment here:
http://wattsupwiththat.com/2014/01/24/how-scientists-study-cycles/#more-102111
I found the paper elsewhere but also found a 2010 guest posting by David Archibald right here at WUWT:
http://wattsupwiththat.com/2010/07/22/solar-to-river-flow-and-lake-level-correlations/
Willis never mentioned that either.
I suggest you all read the comments about the original posting of the peer reviewed and published paper, not by Scafetta, rather, Mauas, P.J.D., A.P.Buccino and E.Flamenco, 2010, Long-term solar activity influences on South American rivers, Journal of Atmospheric and Solar-Terrestrial Physics on Space Climate, March 2010.
But strangely NOW 3.5 years later, I guess things have changed.
I blew the dust off that paper for a reason. I detect that Willis’ disproportionate and somewhat obsessive assault on the Scafetta paper had to do with a little more than the content of the paper. Gosh knows what else! Here I found a paper that was EXALTED in the comments here at WUWT 3.5 years ago and now Willis must again assert that there is no correlation (0.78 is no correlation I guess) and protest that now this paper is junk science.
You all be the judges.
I don’t know the authors of either paper. It seems to me that Scafetta is involved in some political battle and now Willis must toss Mauas under a brand new bus, just so he can be consistent.
Whatever the strange politics that are driving this odd situation, I can only speak for myself, and I say that in both cases there seems to be some form of relationship that begs analysis.
Have a look here at the original WUWT post by David Archibald.
http://wattsupwiththat.com/2010/07/22/solar-to-river-flow-and-lake-level-correlations/
Read the highly contrasting assessments compared with those of this page.
I don’t know quite what to make of it.
Do the Physical Review Letters have the same exacting requirements for a peer review as the Copernicus Publishing? BTW, can we know who reviewed this gem?
David L. Hagen says:
January 25, 2014 at 5:14 pm
I took a look at his paper. I can’t understand his method. It appears that every alternate sunspot cycle has been recorded as a negative number, in order to kinda sorta convert it to a sine wave …
I gotta say, once someone starts doing calculations using the claim that in 1930 there were minus 63 sunspots or the like … my urban legend alarm starts to go off. What is a negative sunspot? There is indeed a 21-year “Hale cycle” of the solar geomagnetic activity, but if you are claiming a correlation with that, then you should use that and not some hacked-up version of sunspots.
And indeed, the annual sunspot data does not lend itself easily to flipping. Consider the following annual average sunspot counts:
1963 27.9
1964 10.2
1965 15.1
Now if you are going to “flip” the cycle starting in 1964, do you flip the 1964 data, or do you start the cycle by flipping the 1965 data?
As near as I can tell, he doesn’t answer that question in his paper, but whichever way it is done, it is bound, guaranteed, to change his results significantly. This is because he then accumulates the number of sunspots … so if the flipping points are all moved back or forwards by one year, what he identifies as critical points (which allegedly line up with changes in flow) move back or forwards by one year … and if they move forwards by one year, we’re left with the paradox of the effect happening before the cause.
Sorry, David, but my inability to figure out either why or how he is flipping sunspots, along with the sensitivity of the results to totally arbitrary flipping decisions, along with the fact that he is using a crude and arbitrary measure like flipped sunspots instead of actually measuring the strength of the 21 year cycle … well, all of that combined makes my hair stand on end. I fear I will give Mr. Archibald’s work a miss.
w.
Now I get it!
In Climatology the term “Correlation does not imply causation”
is a Koan!
Nice to see that paper debunked so quickly.
Though, I don’t think it is helpful to make more general conclusions on the basis of such a poor paper. Please keep focussed on the best evidence and most influential papers.
I am also not impressed by Holgate’s statement, which essentially says he doesn’t believe his own data, because he cannot explain it, and because it can’t explain something else (a long term trend), and then, instead of analyzing, he comes up with an old paper from his boss…
James Strom.
I believe that the faculae make up for the loss from sunspots.
Willis,
Hmmm, wow. I have to agree with you on this one.
Can I be so bold as to make a request? You spend an incredible amount of time going over these papers. You are obviously very dedicated and patient. You have proved beyond a shadow of a doubt that there is a lot of bad science out there. Kudos. My request is, could you focus more on interesting papers that have merit? I can’t speak for others, but for me personally that would be much more enlightening and entertaining. As fun as it is to point at others and laugh together with a shared sense of intellectual superiority, it’d be more entertaining (to me anyways) to hear about actual discoveries. Or not, whatever. You’re the one that puts in the time, so of course, whatever you find the most rewarding. It’s just a suggestion.
Paul Westhaver says:
January 25, 2014 at 7:07 pm
I didn’t even consider acknowledging you, Paul. My bad, I didn’t realize you were that starved for approbation. Let me fix that right now.
Folks, Paul posted the link to the Parana paper, so if you see him, give him a big pat on the back from me. That’s PAUL WESTHAVER, if you were wondering how to spell it. He posted the link, and I can’t possibly tell you what a difference his posting that link has made in my life. If I were to pick one link-poster to be awarded the Kennedy Medal of Freedom, it would be PAUL WESTHAVER, no one’s even close.
Seriously, Paul, do you believe I thought about you enough to deliberately leave your name out of the post? I assure you, your name never crossed my mind, nor am I that petty.
Care to know what my real mental process was regarding the link?
You might note that not only did I not link to you, but contrary to my usual practice I didn’t even link to my own post. This was deliberate, because people on that post wanted to bust my chops over the Copernicus issue, and I wanted to move on. I’m tired of getting abused by handwaving vague fools simply because I’m calling for scientific transparency, and because I hold that if you break the pool rules you can’t complain when the lifeguard kicks you out.
So I didn’t mention the post by name nor discuss it in any fashion.
THAT was why I neither linked to your comment, nor to my own post. I wanted this post to be separate and unconnected.
Sorry you weren’t the first thing on my mind when I made the choice … but heck, you weren’t the last thing on my mind either. I made the decision on entirely different grounds than your imagination provided, grounds that I fear had nothing to do with you.
Regards, and in seriousness, thanks for pointing me to the paper.
Please note, however, that while you provided the pointer to the Parana paper, I provided the work, the insights, the analysis, the thoughts, the math, the graphs, and the writing regarding the Parana paper… and since you haven’t acknowledged me for doing that, I fear your complaint that I didn’t acknowledge your paltry contribution, well, that don’t impress me much.
w.
Ian Schumacher says:
January 25, 2014 at 7:39 pm
Thanks, Ian. Unfortunately, I’m a climate heretic. I hold that the temperature of the earth is NOT determined by the forcing. Instead, the temperature is held within narrow bounds (±3°C over the 20th Century) by the action of emergent climate phenomena including thunderstorms, El Nino, and the PDO.
As a result, there’s not a whole lot of work out there that actually relates to my work. So I generally either work on my own scientific research, or I try to keep bad science under control. Not for reasons of “intellectual superiority”, but simply so that people don’t get led astray.
It’s a long slog …
Thanks again for the good thoughts,
w.
Manfred says:
January 25, 2014 at 7:27 pm
Quotations, Manfred, quotations let us know what you are referring to. Exactly what “general conclusion” of mine are you disagreeing with?
w.