Sunny Spots Along the Parana River

Guest Post by Willis Eschenbach

In a comment on a recent post, I was pointed to a study making the following surprising claim:

Here, we analyze the stream flow of one of the largest rivers in the world, the Paraná in southeastern South America. For the last century, we find a strong correlation with the sunspot number, in multidecadal time scales, and with larger solar activity corresponding to larger stream flow. The correlation coefficient is r = 0.78, significant to a 99% level.

I’ve seen the Parana River … where I was, it was too thick to drink and too thin to plow. So this was interesting to me. Particularly interesting because in climate science a correlation of 0.78 combined with a 99% significance level (p-value of 0.01) would be a very strong result … in fact, to me that seemed like a very suspiciously strong result. After all, here is their raw data used for the comparison:

Figure 1. First figure in the Parana paper, showing the streamflow in the top panel, and sunspot number (SN) and total solar irradiance (TSI) in the lower two panels.

They are claiming a 0.78 correlation between the data in panel (a) and the data in panel (b) … I looked at Figure 1 and went “Say what?”. Call me crazy, but do you see any kind of strong 11-year cycle in the top panel? Because I sure don’t. In addition, when the long-term average of sunspots rises, I don’t see the streamflow rising. If there is a correlation between sunspots and streamflow, why doesn’t a several-decade period of increased sunspots lead to increased streamflow?

So how did they get the apparent correlation? Well, therein lies a tale … because Figure 2 shows what they ended up analyzing.

parana streamflow fig 2

And wow, that sure looks like a very, very strong correlation … so how did they get there from such an unpromising start?

Well, first they took the actual data. Then, from the actual data they subtracted the “secular trends” (see the dark smooth lines in Figure 1). The effect of this first processing step is curious.

Look back at Figure 1. IF streamflow and sunspots were correlated, we’d expect them to move in parallel in the long term as well as the short term. But inconveniently for their theory … they don’t move in parallel. How to resolve it? Well, since the long-term secular trend data doesn’t support their hypothesis, their solution was to simply subtract that bad-mannered part out from the data.
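To make the effect of that subtraction concrete, here is a minimal sketch using made-up data, not the Parana or sunspot records. It assumes a straight-line trend removed by least squares, which is only a stand-in for whatever “secular trend” the authors actually fitted; the names linear_detrend, series_a and series_b are mine. The two synthetic series share a short-term wiggle but drift in opposite directions over the long term.

```python
import numpy as np

def linear_detrend(y):
    """Return y minus its least-squares straight-line fit."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return y - (slope * t + intercept)

# Synthetic data (hypothetical, for illustration only).
t = np.arange(100)
wiggle = np.sin(2 * np.pi * t / 7)   # shared short-term variation
series_a = 0.05 * t + wiggle         # rising long-term trend
series_b = -0.05 * t + wiggle        # falling long-term trend

# Correlation of the data as measured.
raw_r = np.corrcoef(series_a, series_b)[0, 1]

# Correlation after the long-term behaviour has been subtracted out.
detrended_r = np.corrcoef(linear_detrend(series_a),
                          linear_detrend(series_b))[0, 1]

print(f"raw correlation:       {raw_r:+.2f}")
print(f"detrended correlation: {detrended_r:+.2f}")
```

In this toy case the raw correlation is negative, while the detrended correlation is essentially +1. The subtraction does not reveal a hidden relationship in the full data; it changes the question being asked.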

I’m sure you can see the problems with that procedure. But we’ll let that go, the damage is fairly minor, and look at the next step, where the real destruction is done.

They say in Figure 2 that the sunspot data was “smoothed by an 11-yr running mean to smooth out the solar cycle”. However, it is apparent that the authors didn’t realize the effect of what they were doing. Calling what they did “smoothing” is a huge stretch. Figure 3 shows the residual sunspot anomaly (in blue) after removing the secular trend (as the authors did in the paper), along with the 11-year moving average of that exact same data (in red). Again as the authors did, I’ve normalized the two to allow for direct comparison:

Figure 3. Sunspot anomaly data (blue line), compared to the eleven-year centered moving average of the sunspot anomaly data (red line). Both datasets have been normalized to a mean of zero and a standard deviation of one.

Talk about a smoothing horror show, that has to be the poster child for bad smoothing. For starters, look at what the “smoothing” does to the sunspot data from 1975 to 2000 … instead of having two peaks at the tops of the two sunspot cycles (blue line, 1980 and 1991), the “smoothed” red line shows one large central peak, and two side lobes. Not only that, but the central low spot around 1986 has now been magically converted into a peak.

Now look at what the smoothing has done to the 1958 peak in sunspot numbers … it’s now twice as wide, and it has two peaks instead of one. Not only that, but the larger of the two peaks occurs where the sunspots actually bottomed out around 1954 … YIKES!

Finally, I knew this was going to be ugly, but I didn’t realize how ugly. The most surprising part to me is that their “smoothed” version of the data is actually negatively correlated to the data itself … astounding.

Part of the problem is the use of a running mean to smooth the data … a Very Bad Idea™ in itself. However, in this case it is exacerbated by the choice of the length of the average, 11 years. Sunspot cycles range from something like nine to thirteen years or so. As a result, cycles longer and shorter than the 11 year filter get averaged very differently. The net result is that we end up with some of the frequency data aliased into the average as amplitude data … resulting in the very different results from about 1945-60 versus the results 1975-2000.
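Here is a rough sketch of why the 11-year window treats shorter and longer cycles so differently. It assumes idealized pure sinusoids sampled once a year, which the real sunspot record certainly is not, and the function names are mine. For a centered running mean of odd length N applied to a sinusoid of period P, the gain works out to sin(πN/P) / (N·sin(π/P)), and the code checks that formula against a brute-force moving average.

```python
import numpy as np

def boxcar_gain(window, period):
    """Gain of a centered running mean of odd length `window` (years)
    applied to a pure sinusoid with the given period (years)."""
    return np.sin(np.pi * window / period) / (window * np.sin(np.pi / period))

def measured_gain(window, period, n_years=400):
    """The same gain, measured by actually smoothing a sampled cosine."""
    t = np.arange(n_years)
    signal = np.cos(2 * np.pi * t / period)
    smoothed = np.convolve(signal, np.ones(window) / window, mode="valid")
    half = window // 2
    centered = signal[half:n_years - half]  # align with the centered mean
    # Least-squares amplitude of the smoothed series relative to the input.
    return np.dot(smoothed, centered) / np.dot(centered, centered)

for period in (9, 10, 11, 12, 13):
    print(f"period {period:2d} yr: "
          f"formula {boxcar_gain(11, period):+.3f}, "
          f"measured {measured_gain(11, period):+.3f}")
```

Under these assumptions a cycle of exactly 11 years is wiped out, cycles longer than 11 years survive with the right sign but heavily attenuated, and cycles shorter than 11 years come through with a negative gain. A negative gain means peaks are turned into valleys and valleys into peaks, which is the kind of inversion visible in Figure 3.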

Overall? I don’t care what they end up comparing to the red line … they are not comparing it to sunspots, not in any way, shape, or form. The blue line shows sunspots. The red line shows a mathematician’s nightmare.

How about the fact that they performed the same procedure on the Parana streamflow data? Does that make a difference? Figure 4 shows that result:

Figure 4. Parana streamflow anomaly data (blue line), compared to the eleven-year centered moving average of the streamflow anomaly data (red line). Both datasets have been normalized to a mean of zero and a standard deviation of 1.

As you can see, the damage done by the running mean is nowhere near as severe in this streamflow dataset as it was for the sunspots. Although there still are a lot of reversals, and turning peaks into valleys, at least the correlation is still positive. This is because the streamflow data does NOT contain the ± eleven-year cycles present in the sunspot data.

Conclusions? Well, my first conclusion is that as a result of doing what the authors did, comparing the red line in Figure 3 with the red line in Figure 4 says absolutely nothing about whether the Parana river streamflow is related to sunspots or not. The two red lines have very little to do with anything.

My second conclusion is, NEVER RUN STATISTICAL ANALYSES ON SMOOTHED DATA. I don’t care if you use gaussian smoothing or Fourier smoothing or boxcar smoothing or loess smoothing, if you want to do statistical analyses, you need to compare the datasets themselves, full stop. Statistically analyzing a smoothed dataset is a mug’s game. The problem is that as in this case, the smoothing can actually introduce totally false, spurious correlations. There’s an old post of mine on spurious correlation and Gaussian smoothing here for those interested in an example.
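For those who would rather see it than take my word for it, here is a minimal Monte Carlo sketch. It is not the analysis in my spreadsheet; it assumes pairs of completely independent white-noise series, 100 “years” long, and an 11-point running mean, and the seed and trial count are arbitrary. Since the series are unrelated by construction, any large correlation is spurious.

```python
import numpy as np

def running_mean(x, window=11):
    """Simple boxcar running mean; returns only the fully covered points."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability
n_years, n_trials = 100, 2000

raw_r = np.empty(n_trials)
smooth_r = np.empty(n_trials)
for i in range(n_trials):
    a = rng.standard_normal(n_years)  # two unrelated random series
    b = rng.standard_normal(n_years)
    raw_r[i] = np.corrcoef(a, b)[0, 1]
    smooth_r[i] = np.corrcoef(running_mean(a), running_mean(b))[0, 1]

raw_spread = raw_r.std()
smooth_spread = smooth_r.std()
frac_big = np.mean(np.abs(smooth_r) > 0.5)

print(f"spread of r, raw data:      {raw_spread:.2f}")
print(f"spread of r, smoothed data: {smooth_spread:.2f}")
print(f"fraction of smoothed pairs with |r| > 0.5: {frac_big:.1%}")
```

The spread of correlations between the smoothed series is several times wider than for the raw series, so sizeable correlations turn up by chance alone. An ordinary significance test that treats each smoothed point as an independent observation will badly overstate the significance.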

Please be clear that I’m not accusing the authors of any bad intent in this matter. To me, the problem is simply that they didn’t understand and were unaware of the effect of their “smoothing” on the data.

Finally, consider how many rivers there are in the world. You can be assured that people have looked at many of them to find a connection with sunspots. If this is the best evidence, it’s no evidence at all. And with that many rivers examined, a p-value of 0.05 is now far too generous. The more places you look, the more chance of finding a spurious correlation. This means that the more rivers you look at, the stronger your results must be to be statistically significant … and we don’t yet have even passable results from the Parana data. So as to rivers and sunspots, the jury is still out.
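The arithmetic behind “the more places you look” can be sketched in a few lines. This assumes the tests are independent, which real rivers are not, and the river counts are hypothetical. The per-test threshold shown is the simple Bonferroni correction, one common choice among several.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests of a true
    null hypothesis comes out 'significant' at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value needed to keep the overall false-positive
    rate at or below alpha (Bonferroni correction)."""
    return alpha / n_tests

for n_rivers in (1, 10, 100):  # hypothetical numbers of rivers examined
    print(f"{n_rivers:3d} rivers: "
          f"chance of at least one fluke = "
          f"{chance_of_false_positive(n_rivers):.0%}, "
          f"required p-value per river = "
          f"{bonferroni_threshold(n_rivers):.4f}")
```

With ten rivers examined at p = 0.05, the odds of at least one purely accidental “discovery” are already about four in ten.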

How about for sea level and sunspots? Are they related? I can’t do better than to direct you to the 1985 study by Woodworth et al. entitled A world-wide search for the 11-yr solar cycle in mean sea-level records, whose abstract says:

Tide gauge records from throughout the world have been examined for evidence of the 11-yr solar cycle in mean sea-level (MSL). In Europe an amplitude of 10-15 mm is observed with a phase relative to the sunspot cycle similar to that expected as a response to forcing from previously reported solar cycles in sea-level air pressure and winds. At the highest European latitudes the MSL solar cycle is in antiphase to the sunspot cycle while at mid-latitudes it changes to being approximately in phase. Elsewhere in the world there is no convincing evidence for an 11-yr component in MSL records.

So … of the 28 geographical locations examined, only four show a statistically significant signal. Some places it’s acting the way that we’d expect … other places it’s not. Nowhere is it strong.

I haven’t bothered to go through their math, except for their significance calculations. They appear to be correct, including the adjustment to the required significance given the fact that they’ve looked in 28 places, which means that the significance threshold has to be adjusted. Good on them 1980s scientists, they did the numbers right back then.

However, and it is a very big however, as is common with such analyses from the 1980s, I see no sign that the results have been adjusted for autocorrelation. Given that both the sunspot data and the sea level data are highly autocorrelated, this can only move the results in the direction of less statistical significance … meaning, of course, that the four results that were significant are likely not to remain so once the results are adjusted for autocorrelation.
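To give a feel for the size of the autocorrelation adjustment, here is a sketch of one common approach. It is an assumption on my part that this particular approximation suits these data, and I am not claiming it reproduces anything in the Woodworth paper. The effective number of independent points is estimated as n_eff = n(1 − r1·r2)/(1 + r1·r2), where r1 and r2 are the lag-1 autocorrelations of the two series, and the p-value comes from the Fisher z approximation. The numbers in the example (r = 0.3, n = 100, lag-1 autocorrelations of 0.8) are hypothetical.

```python
import math
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def effective_n(n, r1, r2):
    """Approximate number of independent points for correlating two
    series whose lag-1 autocorrelations are r1 and r2."""
    return n * (1.0 - r1 * r2) / (1.0 + r1 * r2)

def fisher_p_value(r, n):
    """Two-sided p-value for correlation r from n (effective) points,
    using the Fisher z normal approximation. Needs n > 3."""
    z = math.atanh(r) * math.sqrt(n - 3.0)
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical example values, not taken from any paper.
r, n = 0.3, 100
n_eff = effective_n(n, 0.8, 0.8)

print(f"naive:    n = {n},  p = {fisher_p_value(r, n):.4f}")
print(f"adjusted: n_eff = {n_eff:.1f},  p = {fisher_p_value(r, n_eff):.4f}")
```

In this made-up case a result that looks comfortably significant with 100 nominal data points is no longer significant at the 0.05 level once only about 22 effectively independent points remain. The helper lag1_autocorr can be used to estimate r1 and r2 from actual series.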

Is there a sunspot effect on the climate? Maybe so, maybe no … but given the number of hours people have spent looking for it, including myself and many, many others, if it is there, it’s likely very weak.

My best regards to all,

w.

NOTA BENE! If you disagree with something I said, please quote my exact words, and then tell me why you think I’m wrong. Telling me things like that my science sucks or baldly stating that I don’t understand the math doesn’t help me in the slightest. If I’m wrong I want to know it, but I have no use for claims like “Willis, you are so off-base in this case that you’re not even wrong.” Perhaps I am, but we’ll never know unless you specify exactly what I said that was wrong, and what was wrong with it.

So if you want me to treat you and your comments with respect, quote what you object to, and specify your objection. It’s the only way I can know what the heck you are talking about, and I’ve had it up to here with vague unsupported accusations of wrongdoing.

DATA: Digitized Parana streamflow data from the paper plus SIDC Sunspot data and all analyses for this post are on an Excel spreadsheet here. You’ll have to break the links, they are to my formula for Gaussian smoothing.

PS—Thanks to my undersea contacts for coming up with a copy of the thirty-year-old Woodworth study, and a hat tip to Dr. Holgate and Steve McIntyre at Climate Audit for the lead to the study. Dr. Holgate is well-known in sea level circles, here’s his comment on the sunspot question:

Many people have tried to link climate variations to sunspot cycles. My own feeling is that they both happen to exhibit variability on the same timescales without being causal. No one has yet shown a mechanism you understand. There is also no trend in the sunspot cycle so that can’t explain the overall rise in sea levels even if it could explain the variability. If someone can come up with a mechanism then I’d be open to that possibility but at present it doesn’t look likely to me.

If you’re interested in solar cycles and sea level, you might look at a paper written by my boss a few years back: Woodworth, P.L. “A world-wide search for the 11-yr solar cycle in mean sea-level records.” Geophysical Journal of the Royal Astronomical Society. 80(3) pp743-755

You’ll appreciate that this is a well-trodden path. My own feeling is that it’s not the determining factor in sea level rise, or even accounts for the trend, but there may be something in the variability. I’m just surprised that if there is, it hasn’t been clearly shown yet.

I can only agree … 


163 thoughts on “Sunny Spots Along the Parana River”

  1. Willis, that’s quite a story! Thanks for chasing it down.

    Cheers — Pete Tillman

    The generation of random numbers is too important
    to be left to chance.

  2. Cool! Do you read this amazing blog WUWT where he shows correlation is irrelevant.

    You are right, try, just try.

  3. I guess on top of everything else there are precious few people or articles we can trust unless we have the knowledge to deconstruct the message ourselves … or in this case have Willis deconstruct it for us.

    thanks again for shining the light on the vermin.

  4. Another conclusion might be that the good old eyeball is an underrated way of spotting correlation, or lack of correlation. Fancy statistics might tease out correlations that are not obvious, but they seem to produce spurious artefacts all too often. And the fancier the statistics, the more sceptical we should be about claimed results. The history of Mannian and Steigian stats should tell us that.

  5. Compare WJR Alexander et al. 2007
    Linkages between solar activity, climate predictability and water resource development
    JOURNAL OF THE SOUTH AFRICAN INSTITUTION OF CIVIL ENGINEERING Vol 49 No 2, June
    2007, Pages 32–44, Paper 659

    This study is based on the numerical analysis of the properties of routinely observed
    hydrometeorological data which in South Africa alone is collected at a rate of more than
    half a million station days per year, with some records approaching 100 continuous years
    in length. The analysis of this data demonstrates an unequivocal synchronous linkage
    between these processes in South Africa and elsewhere, and solar activity. This confirms
    observations and reports by others in many countries during the past 150 years.
    It is also shown with a high degree of assurance that there is a synchronous linkage
    between the statistically significant, 21-year periodicity
    in these processes and the
    acceleration and deceleration of the sun as it moves through galactic space. Despite a
    diligent search, no evidence could be found of trends in the data that could be attributed
    to human activities.
    It is essential that this information be accommodated in water resource development and
    operation procedures in the years ahead.

    Alexander’s life long effort was to compile all hydrology related data for the Southern African region. He is making all the data available on disk to whoever requests it to the address given.

  6. My wife worked as a biostatistician at a medical school. She was involved in lots of research, because the medical journals require that any article involving statistical analysis include a qualified statistician as a co-author. I wish the climate journals had the same requirement.

  7. justsomeguy31167 says:

    January 25, 2014 at 4:43 pm

    Great work! an analysis that you even you do not understand.
    ==============
    Which leads me to the conclusion, that you understand what was not understood ?
    Care to enlighten us ?

  8. I have two grandfather clocks. The timing of the chimes is very strongly correlated, to over a 99% level. Also, one of the two always chimes first, so that one must be causing the other one to chime.
    When the first one of them stops chiming so does the other, so this confirms the causation (whenever a power failure occurs).

    This comment is no sillier than the Parana river study.

  9. I thought about seeing what and how they approached the Sun Spot numbers and even looked at this:

    http://www.leif.org/research/CEAB-Cliver-et-al-2013.pdf

    . . . so maybe Leif will comment.
    Regardless of whether they used “International” or “Group”, the smoothing and processing seems to make it meaningless. I also thought of the William M. (Matt) Briggs series and maybe it will be useful to post those links:
    #1 Do not calculate correlations after smoothing data __Note p=86

    http://wmbriggs.com/blog/?p=86

    #2 Do not smooth times series, you hockey puck! __Change p to 195

    #3 Do NOT smooth time series before computing forecast skill __Change p to 735

  10. Willis, it’s got to be a thankless job, documenting poor science.

    What strikes me in your figure one is the high correlation between graphs (b) and (c), which appears to be between sunspots and solar insolation, unless I am missing an inversion somewhere. This seems at odds with Dr. Svalgaard’s assurances that sunspots reduce the output of solar energy.

  11. “NEVER RUN STATISTICAL ANALYSES ON SMOOTHED DATA” – Mmmm, sometimes it’s appropriate.

    I would disagree in one situation only – a 12-month running mean can average out the (fixed length!) seasonal cycle for the purposes of looking at longer term trends. Not 13 months (which some people for unknown reasons seem to prefer, but which introduces spurious beat frequencies due to the phase difference between 12 and 13 months), not 60 months, but 12.

    Aside from that fairly minor quibble, excellent post. There are entirely too many papers that apply high, low, and bandpass filtering – and then claim extraordinary results from what’s left when their filtering has thrown out the dominant data, leaving only minor side frequencies that just happen to match their preconceptions.

  12. It kills Willis to acknowledge that I posted the paper link in a comment here:

    http://wattsupwiththat.com/2014/01/24/how-scientists-study-cycles/#more-102111

    I found the paper elsewhere but also found a 2010 guest posting by David Archibald right here at WUWT:

    http://wattsupwiththat.com/2010/07/22/solar-to-river-flow-and-lake-level-correlations/

    Willis never mentioned that either.

    I suggest you all read the comments about the original posting of the peer reviewed and published paper, not by Scafetta, rather, Mauas, P.J.D., A.P. Buccino and E. Flamenco, 2010, Long-term solar activity influences on South American rivers, Journal of Atmospheric and Solar-Terrestrial Physics on Space Climate, March 2010.

    But strangely NOW 3.5 years later, I guess things have changed.

    I blew the dust off that paper for a reason. I detect that Willis’ disproportionate and a little obsessive assault on Scafetta was to do with a little more than the content of the paper. Gosh knows what else! Here I found a paper that was EXALTED in the comments here at WUWT 3.5 years ago and now Willis must again assert that there is no correlation (0.78 is no correlation I guess) and protest that now this paper is junk science.

    You all be the judges.

    I don’t know the authors of either paper. It seems to me that Scafetta is involved in some political battle and now Willis must toss Mauas under a brand new bus, just so he can be consistent.

    Whatever the strange politics that are driving this odd situation, I can only speak for myself, and I say that in both cases there seems to be some form of relationship that begs analysis.

    Have a look here at the original WUWT post by David Archibald.

    http://wattsupwiththat.com/2010/07/22/solar-to-river-flow-and-lake-level-correlations/

    Read the highly contrasting assessments compared with those of this page.

    I don’t know quite what to make of it.

  13. Do the Physical Review Letters have the same exacting requirements for a peer review as the Copernicus Publishing? BTW, can we know who reviewed this gem?

  14. David L. Hagen says:
    January 25, 2014 at 5:14 pm

    Compare WJR Alexander et al. 2007
    Linkages between solar activity, climate predictability and water resource development
    JOURNAL OF THE SOUTH AFRICAN INSTITUTION OF CIVIL ENGINEERING Vol 49 No 2, June
    2007, Pages 32–44, Paper 659

    Alexander’s life long effort was to compile all hydrology related data for the Southern African region. He is making all the data available on disk to whoever requests it to the address given.

    I took a look at his paper. I can’t understand his method. It appears that every alternate sunspot cycle has been recorded as a negative number, in order to kinda sorta convert it to a sine wave …

    I gotta say, once someone starts doing calculations using the claim that in 1930 there were minus 63 sunspots or the like … my urban legend alarm starts to go off. What is a negative sunspot? There is indeed a 21-year “Hale cycle” of the solar geomagnetic activity, but if you are claiming a correlation with that, then you should use that and not some hacked-up version of sunspots.

    And indeed, the annual sunspot data does not lend itself easily to flipping. Consider the following annual average sunspot counts:

    1963 27.9
    1964 10.2
    1965 15.1

    Now if you are going to “flip” the cycle starting in 1964, do you flip the 1964 data, or do you start the cycle by flipping the 1965 data?

    As near as I can tell, he doesn’t answer that question in his paper, but whichever way it is done, it is bound, guaranteed, to change his results significantly. This is because he then accumulates the number of sunspots … so if the flipping points are all moved back or forwards by one year, what he identifies as critical points (which allegedly line up with changes in flow) move back or forwards by one year … and if they move forwards by one year, we’re left with the paradox of the effect happening before the cause.

    Sorry, David, but my inability to figure out either why or how he is flipping sunspots, along with the sensitivity of the results to totally arbitrary flipping decisions, along with the fact that he is using a crude and arbitrary measure like flipped sunspots instead of actually measuring the strength of the 21 year cycle … well, all of that combined makes my hair stand on end. I fear I will give Mr. Alexander’s work a miss.

    w.

  15. Nice to see that paper debunked so quickly.

    Though, I don’t think it is helpful to make more general conclusions on the basis of such a poor paper. Please keep focussed on the best evidence and most influential papers.

    I am also not impressed by Holgate’s statement, which essentially says he doesn’t believe his own data, because he cannot explain it, and because it can’t explain something else (a long term trend), and then, instead of analyzing, he comes up with an old paper from his boss…

  16. James Strom.
    I believe that the faculae make for the loss of sunspots.

    Variations in TSI are due to a balance between decreases caused by sunspots and increases caused by bright areas called faculae which surround sunspots. On the whole, the effects of the faculae tend to beat out those of the sunspots.

  17. Willis,

    Hmmm, wow. I have to agree with you on this one.

    Can I be so bold as to make a request? You spend an incredible amount of time going over these papers. You are obviously very dedicated and patient. You have proved beyond a shadow of a doubt that there is a lot of bad science out there. Kudos. My request is, could you focus more on interesting papers that have merit? I can’t speak for others, but for me personally that would be much more enlightening and entertaining. As fun as it is to point at others and laugh together with a shared sense of intellectual superiority, it’d be more entertaining (to me anyways) to hear about actual discoveries. Or not, whatever. You’re the one that puts in the time, so of course, whatever you find the most rewarding. It’s just a suggestion.

  18. Paul Westhaver says:
    January 25, 2014 at 7:07 pm

    It kills Willis to acknowledge that I posted the paper link in a comment here:

    http://wattsupwiththat.com/2014/01/24/how-scientists-study-cycles/#more-102111

    I found the paper elsewhere but also found a 2010 guest posting by David Archibald right here at WUWT:

    I didn’t even consider acknowledging you, Paul. My bad, I didn’t realize you were that starved for approbation. Let me fix that right now.

    Folks, Paul posted the link to the Parana paper, so if you see him, give him a big pat on the back from me. That’s PAUL WESTHAVER, if you were wondering how to spell it. He posted the link, and I can’t possibly tell you what a difference his posting that link has made in my life. If I were to pick one link-poster to be awarded the Kennedy Medal of Freedom, it would be PAUL WESTHAVER, no one’s even close.

    Seriously, Paul, do you believe I thought about you enough to deliberately leave your name out of the post? I assure you, your name never crossed my mind, nor am I that petty.

    Care to know what my real mental process was regarding the link?

    You might note that not only did I not link to you, but contrary to my usual practice I didn’t even link to my own post. This was deliberate, because people on that post wanted to bust my chops over the Copernicus issue, and I wanted to move on. I’m tired of getting abused by handwaving vague fools simply because I’m calling for scientific transparency, and because I hold that if you break the pool rules you can’t complain when the lifeguard kicks you out.

    So I didn’t mention the post by name nor discuss it in any fashion.

    THAT was why I neither linked to your comment, nor to my own post. I wanted this post to be separate and unconnected.

    Sorry you weren’t the first thing on my mind when I made the choice … but heck, you weren’t the last thing on my mind either. I made the decision on entirely different grounds than your imagination provided, grounds that I fear had nothing to do with you.

    Regards, and in seriousness, thanks for pointing me to the paper.

    Please note, however, that while you provided the pointer to the Parana paper, I provided the work, the insights, the analysis, the thoughts, the math, the graphs, and the writing regarding the Parana paper… and since you haven’t acknowledged me for doing that, I fear your complaint that I didn’t acknowledge your paltry contribution, well, that don’t impress me much.

    w.

  19. Ian Schumacher says:
    January 25, 2014 at 7:39 pm

    Willis,

    Hmmm, wow. I have to agree with you on this one.

    Can I be so bold as to make a request? You spend an incredible amount of time going over these papers. You are obviously very dedicated and patient. You have proved beyond a shadow of a doubt that there is a lot of bad science out there. Kudos. My request is, could you focus more on interesting papers that have merit? I can’t speak for others, but for me personally that would be much more enlightening and entertaining. As fun as it is to point at others and laugh together with a shared sense of intellectual superiority, it’d be more entertaining (to me anyways) to hear about actual discoveries. Or not, whatever. You’re the one that puts in the time, so of course, whatever you find the most rewarding. It’s just a suggestion.

    Thanks, Ian. Unfortunately, I’m a climate heretic. I hold that the temperature of the earth is NOT determined by the forcing. Instead, the temperature is held within narrow bounds (±3°C over the 20th Century) by the action of emergent climate phenomena including thunderstorms, El Nino, and the PDO.

    As a result, there’s not a whole lot of work out there that actually relates to my work. As a result, I generally either work on my own scientific research, or I try to keep bad science under control. Not for reasons of “intellectual superiority”, but simply so that people don’t get led astray.

    It’s a long slog …

    Thanks again for the good thoughts,

    w.

  20. Manfred says:
    January 25, 2014 at 7:27 pm

    Nice to see that paper debunked so quickly.

    Though, I don’t think it is helpful to make more general conclusions on the basis of such a poor paper. Please keep focussed on the best evidence and most influential papers.

    Quotations, Manfred, quotations let us know what you are referring to. Exactly what “general conclusion” of mine are you disagreeing with?

    w.

  21. They miss what the data truly shows with all of the mathematical gymnastics. I see very high correlation with the flood cycle of the Pacific NW. An example is the 1996/97 high water, which is the second highest on the graph. That was a semi biblical flood event in No California. These flood events occurred on the ascent after a solar minimum. The year 1964/65, a big year for the Parana. In the Pacific NW, a huge rain event that stretches from SF/Bay Area through to British Columbia. The floods occur shortly after the solar minimum and on the ascent side. The year 1955/56 shows the Parana River at a high level, and moving higher over time. In the Pacific NW, there is a massive flood, although it did not impact as large of an area as 1964/65. The 1955/56 floods occur on the middle of the ascent after the solar minimum and before the max. In 1975/76, the Parana has a high flow. In the Pacific NW there is a drought in No California, and the climate shifts at this point. This is the first year since the 20s where the 9 year flood cycle breaks. It seems to now be between 11 and 12 years between high water events. The years 1975/76 are just prior to a solar minimum. In 1984/85, the Parana River is flowing strong, and the years 1983/84 mark the highest peak recorded in the chart. The Pacific NW had strong rains in areas of the coast and the new cycle of over 11 years per flood is in place, with the following flood cycle in 1996/97. There is some high water in the Pacific NW around 2007/08. It would be interesting to see an updated graph to see if the Parana had a spike at that time. That is on the way down to the solar minimum. The correlations stretch back into the 20s from what I can see and mesh with what I know.

    So once again straightforward observations and historical data would have served them well, in seeing deeper into these charts.

  22. Willis, I have enjoyed reading your stuff over the years and have rooted for you against the warmista. But, Mate, do us a favour, please, less of the sanctimonious BS … you’re better than that; and you’re coming across like a bit of a Mikey Mann.

  23. Willis: As I recall the MOTHER of all “correlation errors” has to be the wolf and moose population on Isle Royale, Lake Superior. “Closed system” obviously, and fairly good tracking of the number of wolves and moose (fly overs, good spotters, consistent methods) from the 30’s through the ’80’s AND, paper after paper after PAPER showed this WONDERFUL correlation showing the wolves controlling the number of moose, etc, yada, and so on… BUT some HERETIC like yourself, took a GOOD statistical look, and said, “PURE NONSENSE”, forcing a re-evaluation. Eventually a plant with a 7 year cycle of abundance and retreat, provided a vital nutrient, which controlled the fertility of the meese (haha, I know MOOSE!) …and the wolf population would more or less correlate with the amount of moose to loose…and all the preceding scholarly work became WORTHLESS. Again, the “prima facie” example of years of “academics” fooling themselves. Delicious. All I have to say is:
    KEEP THROWING THOSE MONKEY WRENCHES IN THE WORKS!

  24. Were any of the co-authors statisticians? I googled them and found nothing suggesting any of them were. Professional journals ought to require a statistician as a co-author of these papers that conflate statistics with science. But I suspect there aren’t enough statisticians to go around.

  25. The correlation coefficient is r = 0.78, significant to a 99% level.

    Thanks Willis. That mediocre correlation and unbelievable significance triggered my BS meter too. But unlike you, I didn’t have the faintest idea what to do about it. Well done.

  26. goldminor says:
    January 25, 2014 at 8:12 pm

    They miss what the data truly shows with all of the mathematical gymnastics. I see very high correlation with the flood cycle of the Pacific NW. An example is the 1996/97 high water, which is the second highest on the graph. That was a semi biblical flood event in No California. These flood events occurred on the ascent after a solar minimum. The year 1964/65, a big year for the Parana. In the Pacific NW, a huge rain event that stretches from SF/Bay Area through to British Columbia. The floods occur shortly after the solar minimum and on the ascent side. The year 1955/56 shows the Parana River at a high level, and moving higher over time. In the Pacific NW, there is a massive flood, although it did not impact as large of an area as 1964/65. The 1955/56 floods occur on the middle of the ascent after the solar minimum and before the max. In 1975/76, the Parana has a high flow. In the Pacific NW there is a drought in No California, and the climate shifts at this point. This is the first year since the 20s where the 9 year flood cycle breaks. It seems to now be between 11 and 12 years between high water events. The years 1975/76 are just prior to a solar minimum. In 1984/85, the Parana River is flowing strong, and the years 1983/84 mark the highest peak recorded in the chart. The Pacific NW had strong rains in areas of the coast and the new cycle of over 11 years per flood is in place, with the following flood cycle in 1996/97. There is some high water in the Pacific NW around 2007/08. It would be interesting to see an updated graph to see if the Parana had a spike at that time. That is on the way down to the solar minimum. The correlations stretch back into the 20s from what I can see and mesh with what I know.

    So once again straightforward observations and historical data would have served them well, in seeing deeper into these charts.

    Goldminor, while that is an interesting theory, so far it is nothing but anecdote. Do you have a record of the rainfall in the “Pacific Northwest”, whatever that might mean to you? I’m always interested in real world data and teasing out relationships, but a list of floods doesn’t help. Humans are great at seeing connections that aren’t there … faces in clouds and constellations in the stars.

    So if you’d provide a link to the data I’m happy to take a look.

    w.

  27. I always thought filters throw away information. Don’t see how they can add information.

    If you do enough low pass filtering, you end up with a single value.

    Then anything correlates with anything else.

    It also seems to me that if you actually have two phenomena that really do have a physical cause and effect relationship, then the one that is the effect will generally be delayed from the one that is the cause (changes). Doing a correlation for various values of time offset, should enable that physical delay to be determined.

    Funny thing Willis, is those Piranhas didn’t seem to do any such analysis.

    If CO2 lags behind temperature by 800 years, How would a correlation at zero delay reveal any connection ?
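
    The lag scan described above can be sketched in a few lines. This is only an illustration on synthetic data, not an analysis of the Parana record: the series, the 5-step delay, and the helper names `pearson` and `lag_correlation` are all made up for the example.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def lag_correlation(cause, effect, max_lag):
    """Correlate cause[t] with effect[t + lag] for lag = 0..max_lag."""
    return {
        lag: pearson(cause[:len(cause) - lag], effect[lag:])
        for lag in range(max_lag + 1)
    }

# Synthetic example (assumed values): the "effect" is the "cause"
# delayed by 5 steps, plus some independent noise.
rng = random.Random(0)
true_delay = 5
cause = [rng.gauss(0, 1) for _ in range(300)]
effect = [
    (cause[t - true_delay] if t >= true_delay else 0.0) + 0.5 * rng.gauss(0, 1)
    for t in range(300)
]

scan = lag_correlation(cause, effect, max_lag=15)
best_lag = max(scan, key=scan.get)
print("lag with highest correlation:", best_lag)
print("correlation at zero lag:", round(scan[0], 3))
```

    In this toy case the zero-lag correlation is near zero even though the two series are tightly linked, which is the point of the CO2 question: a correlation at zero delay says little about a delayed relationship.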

  28. Paul Westhaver says:
    January 25, 2014 at 8:58 pm

    Wow…

    Paul, you falsely accused me in a quite unpleasant manner of deliberately not acknowledging you for pointing out the Parana paper, an accusation which was laughably far from the truth.

    What did you expect in response? That it would make me feel all warm and fuzzy towards you? That I’d ask you to go steady?

    You seem surprised that when you bite me, I bite back … in future, you might save time and avoid further shocks to your belief system by simply assuming that when I’m falsely accused, that’s what I’ll do.

    Wow indeed … in any case, as I’d like to avoid this unpleasantness next time ’round, a simple request for acknowledgement would have a much different outcome. I like acknowledging people, I do it all the time, and I try to do it as a matter of course for those who inspire my posts. As I indicated in my reply, there were other considerations in this case, and as a result I never even thought about it for this paper.

    But deliberately not acknowledging you for your contribution? Not my style and never has been.

    w.

  29. Willis. You mentioned some information on how smoothing introduced spurious correlations. Here is a good place to start.
    Loynes, R. M. 2005. Slutzky–Yule Effect. Encyclopedia of Biostatistics.

    Abstract
    Smoothing a time series by forming a moving average is a commonly employed approach. In this article, some of the problems that arise are discussed, in particular, the introduction of correlations even when the observations in the original series were independent.

    Your work with independent or random series supported what was said in this paper.
    Take care.

  30. george e. smith says:
    January 25, 2014 at 9:56 pm

    I always thought filters throw away information. Don’t see how they can add information.

    A poorly designed filter, or a filter that is poorly chosen for a given task, can indeed add spurious information to a signal, and this one is a great example. In this filter the frequency information is being aliased into the amplitude information. Other than getting aliased information from the frequency, I suspect you are right that the filter can’t add information … but it can move it from one location to another and generally screw with the signal.

    w.

  31. Leonard Lane says:
    January 25, 2014 at 10:19 pm

    Willis. You mentioned some information on how smoothing introduced spurious correlations. Here is a good place to start.
    Loynes, R. M. 2005. Slutzky–Yule Effect. Encyclopedia of Biostatistics.

    Abstract
    Smoothing a time series by forming a moving average is a commonly employed approach. In this article, some of the problems that arise are discussed, in particular, the introduction of correlations even when the observations in the original series were independent.

    Your work with independent or random series supported what was said in this paper.
    Take care.

    Thanks, Leonard. I knew the effect had a name, Steven McIntyre referred to it but I couldn’t remember. Appreciated.

    w.
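
    The effect described in that abstract is easy to reproduce. The sketch below is a rough Monte Carlo illustration, with assumed sizes (100 points, an 11-point running mean, 200 trials) and made-up helper names. It correlates pairs of independent random series before and after smoothing and compares the typical size of the correlation.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def running_mean(x, window):
    """Boxcar (moving average) smooth; output is shorter by window - 1."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

def mean_abs_corr(n_points, window, trials, seed=0):
    """Average |r| between independent series, raw and after smoothing."""
    rng = random.Random(seed)
    raw_total = smooth_total = 0.0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n_points)]
        b = [rng.gauss(0, 1) for _ in range(n_points)]
        raw_total += abs(pearson(a, b))
        smooth_total += abs(pearson(running_mean(a, window),
                                    running_mean(b, window)))
    return raw_total / trials, smooth_total / trials

raw, smoothed = mean_abs_corr(n_points=100, window=11, trials=200)
print("typical |r|, raw series:     ", round(raw, 3))
print("typical |r|, smoothed series:", round(smoothed, 3))
```

    The series are unrelated by construction, so any increase in the typical correlation after smoothing comes from the filter and not from the data.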

  32. Were any of the co-authors statisticians? I googled them and found nothing suggesting any of them were. Professional journals ought to require a statistician as a co-author of these papers that conflate statistics with science. But I suspect there aren’t enough statisticians to go around.

  33. Streetcred says:
    January 25, 2014 at 8:38 pm

    Willis, I have enjoyed reading your stuff over the years and have rooted for you against the warmista. But, Mate, do us a favour, please, less of the sanctimonious BS … you’re better than that; and you’re coming across like a bit of a Mikey Mann.

    Well said that man, Willis gets all twitchy with the “If you disagree with something I said, please quote my exact words, and then tell me why you think I’m wrong.”
    I’m sure that most here understand that game, but some here are trying to point out to you that the ‘attack dog’ writing style really does you no favours.
    I’m in the same boat as Streetcred above, the lack of humility is outstanding. I have no argument with the way you want to ‘do your science’ but you could show a little less aggression in your ‘tone’ when writing your ‘science’ essays.
    Maybe I should just stick to reading your well crafted life experiences and avoid the ‘shove it down your throat’ science articles.
    I’m sorry Willis but the schizophrenic writing style is not so nice to digest.
    yours in honest disappointment

    Joe B

  34. @Willis…I always thought of the Pacific Northwest as SF/BayArea to southern British Columbia. Large storms that cross through this boundary then go on to affect states well to the east. I first heard about the 9 year flood cycle in 1971, when I moved up to the Klamath River. I knew of two of the 9 year floods from personal experience, the 1955/56 and 1964/65. In the summer of 1965 I took a Greyhound bus up to Seattle to stay with cousins for the summer. That was the summer without sun in Seattle. The bus ride took 38 hours from SF to Seattle. It was supposed to be an 18 hour route. The devastation of the flood stretched all the way to Seattle.

    I just looked at a revised ssn chart that was produced by Dr Svalgaard. It has a much higher resolution than most, and I can see that the connection between ssn and high water events is not as clear. The connection with the Pacific NW high water events and the Parana River is right on, though. I saved a link for San Francisco rainfall, 1849 to present. The Parana and SF share some years with peak rain (1996/97, 1982/83, 1972/73, 1941/42, 1930/31, 1911/12), but there are some SF years that are moderate to low against the Parana highs. Although, I notice that for many of the Pacific coastal heavy rain events, San Francisco had a below average rainfall. The No. California/Oregon/Washington big rains in the 40s through the 60s run counter to SF rainfall, during that period, then it changes in the 70s and synchronizes for the 70s, 80s, and 90s. Never noticed that before…

    http://www2.ucar.edu/sites/default/files/news/2013/rainfall_chart_orig.jpg

  35. I just had a chance to have a good look at that figure 1 a)
    which represents the flow rate.
    I suspect that the curved line in that figure 1a) is a best fit of a polynomial of the third order? Or is it a running mean of some sort? What period?
    Either way, looking at that curved line I conclude:
    The minimum flowrate of the Parana river was in 1953 or 1954, average.
    The maximum flowrate appears to be around 1990, average.
    There is no data before 1905, but it seems the curve came down from a maximum flow rate at around 1895.

    Now look here:
    There are good records of the flooding of the Nile, for example here:

    http://www.cyclesresearchinstitute.org/cycles-astronomy/arnold_theory_order.pdf

    to quote from the above paper:
    “A Weather Cycle as observed in the Nile Flood cycle, Max rain followed by Min rain, appears discernible with maximums at 1750, 1860, 1950 and minimums at 1670, 1800, 1900 and a minimum at 1990 predicted.
    The range in meters between a plentiful flood and a drought flood seems minor in the numbers but real in consequence….

    end quote

    According to my table for maxima,

    http://blogs.24.com/henryp/2013/02/21/henrys-pool-tables-on-global-warmingcooling/

    I calculate the date where the sun decided to take a nap (that is just a figure of speech, in fact it is probably a “wake-up”), as being around 1995, and not 1990 as William Arnold predicted.
    This is looking at energy-in. I think earth reached its maximum output (means) a few years later, around 1998/1999.

    Anyway, either way, (a few years error is fine!), look again at my best sine wave plot for my data,

    http://blogs.24.com/henryp/2012/10/02/best-sine-wave-fit-for-the-drop-in-global-maximum-temperatures/

    now see:

    1900 minimum flooding – end of the warming
    1950 maximum flooding – end of cooling
    1995 minimum flooding – end of warming.
    predicted 2035-2040 – maximum flooding – end of cooling.

    There is a clear and pertinent correlation with the best fit sine wave that I proposed for the observed current drop in global maximum temperatures, both for the Parana and Nile rivers.

    What causes the current decrease in these rivers’ flow is fairly simple: As the temperature differential between the poles and equator grows larger due to the cooling from the top, very likely something will also change on earth. Predictably, there would be a small (?) shift of cloud formation and precipitation, more towards the equator, on average. At the equator insolation is 684 W/m2 whereas on average it is 342 W/m2. So, if there are more clouds in and around the equator, this will amplify the cooling effect due to less direct natural insolation of earth (clouds deflect a lot of radiation). Furthermore, in a cooling world there is more likely less moisture in the air, but even assuming equal amounts of water vapour available in the air, a lesser amount of clouds and precipitation will be available for spreading to higher latitudes. So, a natural consequence of global cooling is that at the higher latitudes it will become cooler and/or drier.
    In a cooling world such as ours now,

    http://www.woodfortrees.org/plot/hadcrut4gl/from:1987/to:2014/plot/hadcrut4gl/from:2002/to:2014/trend/plot/hadcrut3gl/from:1987/to:2014/plot/hadcrut3gl/from:2002/to:2014/trend/plot/rss/from:1987/to:2014/plot/rss/from:2002/to:2014/trend/plot/hadsst2gl/from:1987/to:2014/plot/hadsst2gl/from:2002/to:2014/trend/plot/hadcrut4gl/from:1987/to:2002/trend/plot/hadcrut3gl/from:1987/to:2002/trend/plot/hadsst2gl/from:1987/to:2002/trend/plot/rss/from:1987/to:2002/trend

    it will simply become wetter at the lower latitudes…..

    A clever farmer living at high latitude, who already experienced drought situations, would realize that it is not going to get better for the next three decades. He would now pack up his bags and move to a place of lower latitude.

  36. goldminor, thanks for the chart of SF rainfall. I digitized it and checked it against the sunspot record … correlation 0.036, p-value 0.64, no relationship …

    w.

  37. goldminor says: January 25, 2014 at 8:12 pm
    – – –
    I have lived just north of the Pacific NW, in the Canadian Pacific SW for nearly 50 years. My knowledge of historic flooding is that the amount of flooding is related to both the quantity of the previous winter’s snow pack and the timing of the hot weather in spring. About 100 years ago there was a great flood that today would have wiped out thousands of homes including mine. Since that time dikes have been built, hundreds and hundreds of miles of them, and I have ridden my bike on a large number of them. Just a few years ago there was a great scare of spring flooding which resulted in millions of additional dollars being spent to increase the height of the dikes. The flooding turned out to be a dud, with levels even below average or at least nothing to write home about. I did get a lovely extra height for my bike rides for better sight seeing, at taxpayer expense. But I suppose it’s insurance for any future potential flooding. A couple of years ago, the Fraser river and Pitt river were at high levels, so much so that a tower holding power lines for crossing the Fraser was knocked out of commission and there were fish on my bike path, at the location where it passes below the railroad tracks.
    Perhaps not a lot of science in my comment, just anecdotal observations. However there is more to flooding in this area than just the amount of rain.

  38. Excellent Willis. This is a perfect example of the kind of garbage that can result from these ubiquitous running mean “smoothers”. In fact I’ve never seen such wholesale inversion. It would have made an ideal example for my article on running mean distortion on Judith’s Climate Etc.

    http://judithcurry.com/2013/11/22/data-corruption-by-running-mean-smoothers/

    I’m glad the issue is getting some coverage. This may be a bit of an odd-ball paper but this kind of filtering is de rigueur in climate science. There seems to be barely a paper that does not use it somewhere and of course the processing of the “gold standard” hadSST dataset uses it over adjacent grid cells in an iterative loop to determine their background climatology.

    The other main application is our friend the monthly average, which is mathematically equivalent to applying a monthly running mean and then resampling at monthly intervals, whereas correct processing would require a 2 month anti-alias filter.

    Most of the current data processing being done in climate science is doing more to ensure that they do not identify any natural periodic forcing than anything else. But that probably plays to the “consensus” view that it’s all stochastic ‘internal’ variation plus CO2.

    Bias confirmation at work.

    I used SSN in my article as an example of the effect of the monthly running mean. It is used in determining the date of the “peak” of each solar cycle. In the current cycle it finds the peak to be in the month that has the lowest SSN for the last 2.5 years !!

  39. Greg Goodman says:
    January 26, 2014 at 1:19 am

    Excellent Willis. This is a perfect example of the kind of garbage that can result from these ubiquitous running mean “smoothers”. In fact I’ve never seen such wholesale inversion. It would have made an ideal example for my article on running mean distortion on Judith’s Climate Etc.

    Thanks, Greg. It has been your urging and your statements about running mean smoothers on various of my threads that gave me the insight regarding the problem with the Parana data … I was still shocked, though, to find out that the resulting “smooth” actually has a negative correlation with the data. Astounding.

    Best regards,

    w.

  40. HenryP says:
    January 25, 2014 at 11:41 pm

    Can somebody direct me to the data of that river’s flow? Henry

    The river flow data is in the Excel spreadsheet linked to at the bottom of the head post, Henry.

    w.

  41. Willis said, “I can’t understand his method.”

    Neither do I. But if you look at the Normalized Sunspot Anomaly and 11-year Running Mean, and invert the blue Sunspot Anomaly line, it looks close to me. Willis, why don’t you ask them first?

  42. Willis, the mathematics of this paper as you have outlined it, reminds me of my youth.

    I am in the electronics business and as a school kid I was into radio. When I was around 18 years old that interest shifted to music. I wonder why? Then I grew up and went back to my first interest, radio, and qualified as an electronic engineer in 1981.

    So what has that to do with this paper?

    Well, one of the things that I noticed in the field of audio and HiFi was that all music amplifiers and loudspeakers were rated as X-many Watts peak, and Y-many Watts RMS!

    So what is 100 W RMS? Mathematically I can calculate the RMS (root-mean-square) of the audio power, but what does it mean? Nothing really. It bears no direct relationship to anything you can physically observe. (Note – I said “direct”).

    It is merely marketing gobbledygook. On the other hand I can calculate the RMS of the voltage waveform or the current waveform going to the loudspeaker and that HAS got meaning. It is directly related to the average audio power (in watts).

    This paper is similar in that it is presenting marketing nonsense. What is the physical meaning of de-trended data? What is the meaning of smoothing it? Without good physical reason to apply a mathematical process to data, the result is meaningless gobbledygook.
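
    The RMS point can be checked numerically. The sketch below assumes a pure sine wave of 40 V peak driving an 8 ohm resistive load (both values chosen only for illustration) and compares the average power, the power derived from the RMS voltage, and the literal "RMS of the power waveform".

```python
import math

R_LOAD = 8.0    # ohms, assumed purely resistive loudspeaker load
V_PEAK = 40.0   # volts, assumed peak of a sine-wave test signal
N = 10_000      # samples over exactly one period

voltage = [V_PEAK * math.sin(2 * math.pi * k / N) for k in range(N)]
power = [v * v / R_LOAD for v in voltage]   # instantaneous power, watts

v_rms = math.sqrt(sum(v * v for v in voltage) / N)
avg_power = sum(power) / N
power_from_v_rms = v_rms ** 2 / R_LOAD
rms_of_power = math.sqrt(sum(p * p for p in power) / N)

print("average power (W):      ", round(avg_power, 2))
print("V_rms^2 / R (W):        ", round(power_from_v_rms, 2))
print("RMS of power signal (W):", round(rms_of_power, 2))
```

    The first two numbers agree, which is why the RMS voltage is physically meaningful. The third is a different, larger number that does not correspond to the heat delivered to the load.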

  43. Ox AO says:
    January 26, 2014 at 1:31 am

    Willis said,

    “I can’t understand his method.”

    Neither do I. But if you look at the Normalized Sunspot Anomaly and 11-year Running Mean if you invert the blue line Sunspot Anomaly it [looks?] close to me. Wills why don’t you ask them first?

    Why not ask them? Mostly ’cause life is too short and it’s too far down on my priority list, Ox. Also, I’m philosophically opposed to the idea of negative numbers of sunspots. Plus I continue to work on my own research, plus write papers like this one. However, I encourage you to write them if you wish.

    All the best,

    w.

  44. “My second conclusion is, NEVER RUN STATISTICAL ANALYSES ON SMOOTHED DATA. I don’t care if you use gaussian smoothing or Fourier smoothing or boxcar smoothing or loess smoothing, if you want to do statistical analyses, you need to compare the datasets themselves, full stop. Statistically analyzing a smoothed dataset is a mug’s game. The problem is that as in this case, the smoothing can actually introduce totally false, spurious correlations.”

    While there is a lot of truth in that as a basic warning, it starts to go wrong with the word NEVER.

    I always like the popular, self-contradictory, axiom : you should never generalise.

    This comes back to my gripe about “smoothers”. If you just want the data to _look_ smoother for a graph, this should have nothing to do with the data processing and your statement is correct.

    However, if you have, for example, a strong annual cycle and you want to see whether there is a small decadal scale correlation between two datasets you are not going to get the answer if you don’t filter out the annual cycle.

    Like most “you should NEVER” statements this one is incorrect if taken literally.

    The part about increasing the correlation is also correct since all weighted average filters, even good ones, combine successive data points and reduce the degrees of freedom in the data. If this is ignored in calculating what level of correlation is significant the answer will be very misleading.

    The point is to recognise that there is not just one fixed value that shows “good” correlation but that the value is determined by the number of “degrees of freedom” in the data. Often this can be taken as the number of data points (before filtering) and needs to be reduced appropriately if filtering is used.

    Relevant comment by rgbatduke :

    http://wattsupwiththat.com/2014/01/21/sunspots-and-sea-level/#comment-1548206

    I would suggest a better ‘never’ statement would be :

    You should NEVER use a correlation coefficient to conclude significance without calculating what value is significant for the data in question.
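
    As a rough illustration of how the threshold moves with the degrees of freedom, the sketch below uses the Fisher z approximation to estimate the smallest |r| that would be significant at a given level. The function name `critical_r`, the 100-point record and the 11-point filter are assumptions for the example, and the approximation is crude for very small samples.

```python
import math
from statistics import NormalDist

def critical_r(n_effective, alpha=0.05):
    """Approximate smallest |r| significant at level alpha (two-sided),
    using the Fisher z transform: atanh(r) * sqrt(n - 3) ~ N(0, 1)."""
    if n_effective <= 3:
        raise ValueError("need more than 3 effective data points")
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n_effective - 3))

# Assumed example: 100 annual values, then an 11-point running mean.
n_raw = 100
window = 11
n_filtered = n_raw / window   # simple "divide by filter length" adjustment

print("threshold with raw count:     ", round(critical_r(n_raw), 3))
print("threshold with reduced d.o.f.:", round(critical_r(n_filtered), 3))
```

    With the same nominal significance level, the correlation needed to pass the test rises sharply once the filter length is taken into account.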

  45. I am not sure if this is relevant, but in mid-1982 in Johannesburg, three friends and I started up a water drilling company on the basis that a period of droughts was imminent.

    Perhaps more by luck than good judgement, the rains failed in the summer of 1982/83 and the next two summers were exceptionally dry. The company prospered and grew like Topsy.

    We had examined the rainfall records for the Highveldt area around Johannesburg over the previous 100 years and noticed there was a very biblical cycle of dry and wet years. 11 years of good rains (usually with a couple of poor years) followed by 11 years of poor rains (usually with a couple of good years) . I have no idea if this 11 year cycle has continued to today.

    Perhaps this 11 year cycle was a coincidence, anyhow I cannot see if it could have had anything to do with sunspots.

  46. Greg:

    I think NEVER in this case is a good word. If you want to ‘see’ the relationship within some data then by all means filter it until it appears to show what you want.

    But if you are running code on your data to ‘find’ relationships within the data, why would you want to risk modifying the data by filtering?

    All filtering, averaging or any other process that changes your original data to something else, has by design, changed your data.

    Smoothing is only for humans to observe.

    All of the exquisite data variations captured by some tedious or expensive process deserve to be used, ‘in the raw’ by your analysis programs.

    I am sure R or C or perl do not mind if your data values wiggle around a bit too much, preventing them looking very nice.

    Smoothing for humans, raw code for programs.

  47. For the last 100 years, which is all the data I have, where I farm, good rains come with the low of the sunspot cycle. I haven’t calculated any coefficients, because in the rest of the cycle there is no particular eyeball relationship, but in forty years of farming at least I know what will happen every decade at some point. No idea if this holds elsewhere but it works here. I don’t need scientific approval, papers, peer review, or Willis’ approval. There are more things in heaven and earth, Horatio, etc.

  48. Steve Richards says: “why would you want to risk modifying the data by filtering?”

    How about reading my comment before trying to reply to it ?
    ;)

  49. No link between solar activity and river flow? NASA found one:

    http://www.nasa.gov/vision/earth/lookingatearth/nilef-20070319.html

    Alexander Ruzmaikin and Joan Feynman of NASA’s Jet Propulsion Laboratory, Pasadena, Calif., together with Dr. Yuk Yung of the California Institute of Technology, Pasadena, Calif., have analyzed Egyptian records of annual Nile water levels collected between 622 and 1470 A.D. at Rawdah Island in Cairo. These records were then compared to another well-documented human record from the same time period: observations of the number of auroras reported per decade in the Northern Hemisphere.
    [..]
    The researchers found some clear links between the sun’s activity and climate variations. The Nile water levels and aurora records had two somewhat regularly occurring variations in common – one with a period of about 88 years and the second with a period of about 200 years.
    The researchers said the findings have climate implications that extend far beyond the Nile River basin.
    [..]
    So what causes these cyclical links between solar variability and the Nile? The authors suggest that variations in the sun’s ultraviolet energy cause adjustments in a climate pattern called the Northern Annular Mode, which affects climate in the atmosphere of the Northern Hemisphere during the winter. At sea level, this mode becomes the North Atlantic Oscillation, a large-scale seesaw in atmospheric mass that affects how air circulates over the Atlantic Ocean. During periods of high solar activity, the North Atlantic Oscillation’s influence extends to the Indian Ocean. These adjustments may affect the distribution of air temperatures, which subsequently influence air circulation and rainfall at the Nile River’s sources in eastern equatorial Africa. When solar activity is high, conditions are drier, and when it is low, conditions are wetter.
    Study findings were recently published in the Journal of Geophysical Research.

  50. I don’t think the Parana river correlation is the best scientific work on the sun-earth connection, but there are several works in the same direction for the NH:
    The Mississippi catch area:

    http://ks.water.usgs.gov/pubs/reports/paclim99.html

    http://ks.water.usgs.gov/solar-irradiance

    The rivers in Portugal:

    http://onlinelibrary.wiley.com/doi/10.1029/2005GL023787/abstract

    Similar reports from the river Po (Italy) and Nile (Egypt) were available, but the links don’t work anymore…

    The background may be that an active sun increases mainly in the UV range, which increases ozone in the lower stratosphere, increasing its temperature and the temperature difference equator-poles, pushing the jet streams towards the poles, including wind and rain patterns. That makes that several rivers/countries will have more rain at high sunspot levels, but I suppose that it is a mix of several influences: PDO/NAO, ENSO, solar activity,…

  51. In my view (possibly minority of one) climate events do not react to ‘a forcing’ represented by sunspot number, it is more likely to be the geomagnetic disturbances, which happen to be in time and intensity considerably different to the sunspot series:

    A further problem is that CMEs, the cause of the geomagnetic disturbances, have a magnetic polarity, as does the Earth’s field, and this is not taken into account.
    The Dst index does this, but it is derived from a network of near-equatorial geomagnetic observatories, while the geomagnetic storms’ effect (I think) propagates atmospherically from higher to the lower latitudes.

  52. Douglas J. Keenan says:
    January 26, 2014 at 2:18 am

    A nice related post is given by Matt Briggs: “Do NOT smooth time series before computing forecast skill“.

    This is a slight improvement on his “hockey puck” article but still, having said that filtering increases correlation, he regards this as a black and white issue and thus concludes you should “never” assess the correlation of filtered data.

    He claims to have some kind of expertise in statistics, yet seems to miss the whole point, and the opportunity to suggest how to adjust what level of correlation is significant.

    There is also the question of the degree of auto-correlation present in the data to start with.

    I use a simplistic adjustment in that if I use a 12mo low pass filter I reduce the number of data points by a factor of twelve.

    AR(1) can be removed with a first difference, this reduces degrees of freedom by half.

    Can you help on that?

  53. What’s missing is the error on the flow data. Without it you can’t judge whether the variation is real or simply a noisy signal.

    The second thing is that there can be a time shift between the two signals. Shift the upper signal by, say, 6 years and see how that looks. The Parana river is fed from a basin the size of Europe which may have a response time of several years to changes affecting the vegetation such as varying far UV insolation.

  54. Douglas Keenan, to clarify if I filter monthly data with 12mo low-pass I divide the d.o.f by 12 , if I do a diff , I divide d.o.f. by 2.

    Can you provide any better suggestions for correct calculation of significance in correlation coefficient where processing has reduced the degrees of freedom in the data?
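
    One adjustment that is often used for this, offered here only as a sketch, estimates an effective sample size from the lag-1 autocorrelation of each series: N_eff = N (1 - r1 r2) / (1 + r1 r2). It assumes both series behave roughly like AR(1) processes, which a boxcar-filtered series does not strictly do. The data below are synthetic and the helper names are made up for the example.

```python
import random

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def effective_n(x, y):
    """N_eff = N * (1 - r1*r2) / (1 + r1*r2), capped at N.
    Assumes roughly AR(1) behaviour in both series."""
    n = min(len(x), len(y))
    rr = lag1_autocorr(x) * lag1_autocorr(y)
    return min(n, n * (1 - rr) / (1 + rr))

def running_mean(x, window):
    """Boxcar (moving average) smooth; output is shorter by window - 1."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

# Synthetic monthly-style data (assumed): two independent noise series.
rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(600)]
b = [rng.gauss(0, 1) for _ in range(600)]

n_eff_raw = effective_n(a, b)
n_eff_smooth = effective_n(running_mean(a, 12), running_mean(b, 12))
print("effective N, raw:         ", round(n_eff_raw))
print("effective N, 12-pt smooth:", round(n_eff_smooth))
```

    The resulting effective N can then be used in place of the raw count when deciding what correlation counts as significant. Whether it beats the simple divide-by-12 rule depends on how well the AR(1) assumption holds for the data in question.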

    Thx.

  55. Ed Zuiderwijk says:
    The second thing is that there can be a time shift between the two signals.
    ===

    Indeed, a lag-correlation plot would be more appropriate.

  56. I blame Bill Gates and Steve Jobs for all this. By providing a machine with a button on it that says “Do the statistics” they let a host of mathematical monkeys into science. We now have a few who understand what they are doing, like Willis, McIntyre, Briggs, and then a host of people who push buttons on computers and print the result without a clue as to the meaning of it all. Sad really. We are expected to compare these shadow men like Mann with the greats of the past who worked with chalk boards and pencils, slide rules and in China, the abacus. Take away their all singing and all dancing computers and they would not be able to sweep the streets without a sign saying “brush here”.
    I, of course, would be found brushing from the other direction, but then I have never claimed to be that most ephemeral and illusory of creatures, “a scientist”.

  57. Very good point Ivor.

    I agree that Excel functions where you can fit a “trend” or get the correlation at the click of a button invite people to do things that they do not understand.

    People conclude that they have “the” trend , like there always is a meaningful linear trend to be had irrespective of whether fitting a linear model to the data makes any sense at all.

    In this respect Phil Jones gets cred from me for not knowing how to fit a trend in Excel. If we could ban “trend” fitting from the discussion we’d get a lot further, a lot quicker.

  58. At first sight figure 3 is wrong.

    Take the year ~1988. The original data for the years +- 5 years appear to be negative or weakly positive. Yet according to your calculations the result is strongly positive.

    This is obviously wrong and I do not believe that the correlation between the original and low-pass filtered is 0.1. Why? Because the correlation function is multiplied by the square of filtered spectrum and integrated with respect to frequency.

    I would say that you have dropped a real clanger here.

    COME ON WILLIS, SHOW US YOUR DATA AND SHOW US YOUR CODE!

  59. RC. It would be good to check, but don’t forget the resulting “smoothed” data is normalised by S.D., so it will be scaled up considerably (factor of 10 at a guess).

    That does not detract from W.’s criticism, that this is effectively what this paper is doing.

    If he’s on the hunt for crap data processing in climate science he’s going to have a post per day for the rest of the century, to keep us entertained.

  60. I agree with Willis’ observations here.

    You would be surprised how many climate science papers do something similar to this. The vast majority in fact, beyond all kinds of other made-up math and made-up charting methods.

    For solar, the numbers need to be changed to energy-based measures. Sunspots mean nothing. I could be convinced that some type of accumulating/declining balance W/m2 measure describes what the Sun actually does but it is not a smoothed 11 year cycle. There needs to be a physical explanation for the Sun causing cycles, not just a random correlation. For the Milankovitch cycles, at least here we have a physical explanation that is logical, but the numbers in this case don’t line up either. In other words, correlation, especially with a smoothed series of data which has no physical cause basis is just a meaningless exercise.

    I can’t find a solar cycle signal in the temperature numbers. One particular dataset using one methodology (amongst a dozen other possibilities) showed a bare hint of a solar cycle – all others have nothing. That means it must be very, very small, at least in the time-period we have real energy measures, since 1978 that is. The Maunder Minimum might have had much lower energy levels than is currently believed but we don’t really know.

  61. Hi Willis,
    You comment: I can’t understand his method. It appears that every alternate sunspot cycle has been recorded as a negative number, in order to kinda sorta convert it to a sine wave

    I don’t think much is wrong with that, providing the reasons are justified, or at least mentioned, and they were not in that paper.
    I often do it myself, and if I write narrative (which often I do not!) I mention NASA’s statement as an authoritative back up.
    Here, I will outline my view for necessity of sign-ing the incoming solar magnetic impact:
    Sun has two ‘oscillating’ hemispheres (often posited two dynamos which occasionally get out of phase), and if you plot sunspot cycles for each hemisphere separately (including magnetic polarity) you would get two Sine waves more or less in a counter phase.
    As far as the Earth and rest of the solar system see it (exception are the official sunspot number observers), magnetic fields of two hemispheres do not mix, they are separated by the solar current sheet (I believe Dr. Svalgaard is one of discoverers).
    The Earth, Jupiter, Saturn etc, i.e. all magnetic sensitive entities, see most of the time either one or the other, while the through transition is often very short and some time geomagnetically strong.
    There is another point worth noting as far as the Earth events are considered:
    Solar coronal mass ejections, CMEs (according to the NASA’s observations and an official statement) in the even-numbered solar cycles tend to hit Earth more often with a leading edge that is magnetized north. Such CMEs open a breach and load the magnetosphere with plasma starting a geomagnetic storm.
    Hence, in my opinion (I acknowledge always disputed and characterised as wrong by Dr.S) it is necessary to take account of the incoming magnetic polarity. There is also question of a possible inclination difference of the drifting Earth’s magnetic axes to the current sheet, between the even and odd numbered cycles, but that is far more complex factor.
    Following the official stance that the sun’s magnetic polarity doesn’t mater, science will always be able to refute possibility of the sun’s influence on the climate natural variability.
    But if one takes opposite view on polarity as I outlined above, than it is a child’s play to get, let me say clearly, not numerical correlation if that is important to you, but up/down step by step follow in time , as shown here

    http://www.vukcevic.talktalk.net/Ap-NHT.htm

    between the most trusted Ap index and the Earth’s N. Hemisphere temperature’s natural variability (narrative may follow some day).

  62. I have to withdraw my last comment.

    I have downloaded the NOAA sunspot data and formed a running mean and Figure 3 is essentially correct. I apologise.

    However, having now read the paper very carefully, I do not think that this is what they are presenting. There is an extraordinary correlation between the sunspot data and the river flow. I do not think that they imply that they filtered out the eleven-year cycle in the sunspot data, and they certainly have not done this. What they are saying is that there is a correlation between sunspots and river flow. It makes no sense to remove the principal component of the sunspot variability when correlating it with something that they believe is correlated with sunspots. The data in Figure 2 clearly has not been filtered.

    What I believe they have done is smooth the de-trending function with an 11 year filter to remove a spurious de-trending component. I agree that this isn’t clear although one thing that should be borne in mind is that all three authors are Argentinian and their native language is, I assume, not English. I have seen a vast number of misunderstandings stemming from this problem.

  63. If I had to analyze the data, I would decompose the streamflow data empirically using Empirical Mode Decomposition (EMD), followed by a cross-correlation analysis and subsequent statistical tests.
    Out of curiosity, I did it. Here are the results:

    (1) Raw data:

    http://bayimg.com/KahNHaAFm

    (2) First I normalized the data (z-score), decomposed the streamflow data using ensemble EMD (EEMD) (std = 0.1, N = 100) [1], found that the third component contains the oscillation possibly related to the sunspot data, and performed a cross-correlation analysis, which indicated a lag of 4 years, leading to:

    http://bayimg.com/KaHniaAFm

    http://bayimg.com/KAhnoaafm (cross-correlation function)

    (3) Finally, I inverted the streamflow signal, added a 4 year lag and got this:

    http://bayimg.com/lAHnaAAfm

    (4) There seems to be a correlation. I tested it with a t-test, giving me a p-value of 0.1121, i.e. the correlation is not statistically significant. However, there seems to be a phase synchronization between these signals. The next step would be to use data with shorter sampling intervals (e.g. daily values).

    [1] Huang et al. (2009). Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method, Adv. Adapt. Data Anal. 01, 1, http://www.worldscientific.com/doi/abs/10.1142/S1793536909000047
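
    The normalize-then-cross-correlate step in (2) can be sketched roughly as follows. This is only an illustration in plain Python on a synthetic series, not the EEMD workflow and not the Parana data; the helper names `zscore` and `lagged_corr` and the 11-sample sine wave are assumptions made up for the example.

```python
import math

def zscore(x):
    """Normalize to zero mean and unit (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]

def lagged_corr(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] over the overlap."""
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    za, zb = zscore(a), zscore(b)
    return sum(p * q for p, q in zip(za, zb)) / len(za)

# Synthetic example (hypothetical data): y is x delayed by 4 samples.
n = 110
x = [math.sin(2 * math.pi * t / 11) for t in range(n)]
y = [math.sin(2 * math.pi * (t - 4) / 11) for t in range(n)]

# Scan a range of lags and keep the one with the largest correlation.
corrs = {lag: lagged_corr(x, y, lag) for lag in range(-5, 6)}
best_lag = max(corrs, key=corrs.get)
print(best_lag, round(corrs[best_lag], 3))
```

    One caution when reading a result like (3): for a pure sinusoid, inverting the signal is the same as shifting it by half a period, so with roughly periodic data an "inverted, lagged" match and a differently lagged, non-inverted match can be hard to tell apart.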

  64. Ivor

    You nailed it.

    “By providing a machine with a button on it that says “Do the statistics” they let a host of mathematical monkeys into science. We now have a few who understand what they are doing, like Willis, McIntyre, Briggs, and then a host of people who push buttons on computers and print the result without a clue as to the meaning of it all.”

    It is not only the field of science which is suffering. I spent thirty years in building systems consulting engineering and currently work for a developer. When I was hired, my job description was “Ride herd on the consultants.” I have had several young engineers reporting to me, who at times have stated conclusions so blisteringly stupid that I have had to go back over the RFPs to be certain of what I had asked them to do. They simply don’t know when they get something wrong because they rely entirely on software. It has happened so frequently we laughingly gave it a name: the sorcerer’s apprentice syndrome. Apparently there is also a button which says, “Design The HVAC systems”.

    They do, however, show no lack of confidence in their results.

  65. Henry@Erwin
    nice work there
    the lag could oscillate between 4 and 6 years, depending on what happens on earth and with earth’s core a la Vukcevic and what happens in the atmosphere
    it would therefore be a much better idea to try and correlate the flowrate of the river with maximum temperatures, or even with the speed of (global) maximum temperatures, as determined by me,

    http://blogs.24.com/henryp/2012/10/02/best-sine-wave-fit-for-the-drop-in-global-maximum-temperatures/

    which is like a real gauge of the amount of energy coming in. The sunspots are an indirect measure.

  66. henry@all
    I just need to know in what town the flowmeter is situated and which way does the river flow? Is it north to south or south to north?

  67. let me guess from the data
    the flowmeter is around -40 latitude
    and the river flows opposite the direction of Nile, north to south
    Am I right?

  68. I am not a Climate Scientist, but a humble Civil Engineer. I lived in Kenya for fourteen years or so, and we had to react when the level of Lake Victoria rose by six feet over only two rainy seasons – given that the lake is the size of Ireland in a catchment the size of the UK, that is pretty amazing. And the lake didn’t return to previous levels (which had been stable since European entry around 1900) for about 40 years. There’s a (UK) Institution of Civil Engineers Paper 09-00041 (P J Mason), published May 2010, which looks at correlation with sunspots for the Lake, the Nile, and African rivers in general.

  69. “Is there a sunspot effect on the climate? Maybe so, maybe no … but given the number of hours people have spent looking for it, including myself and many, many others, if it is there, it’s likely very weak.”

    I agree these short term cycles and variations would be difficult to filter out of the noise of the interaction of all the systems involved in the transfer of energy from the Sun to the Earth. My interest remains in the present possible ‘new Dalton Minimum’ and the ‘solar maximum’ we had in sun spot activity within the range of the modern warming we’ve experienced. We only need a couple hundred more years of data of the quality we’ve collected for 30 years to figure it out.

    In the mean time we could use modelling to elicit magical properties from Sun Spots, just as we have for CO2, or more transparent statistical methods and manipulations as described in this article and used by many in climate ‘science’ … just for entertainment and debate :D now that we’ve changed the Northern Jet Stream into the evil ‘Arctic Vortex’ and seems to be flopping to the East when it pushes South … over people softened by the last 30 years … who knows what wondrous causal relationships we can come up with … like Autism and vaccinations …

    I wonder if what you describe in this article means the bit in ‘Wonders of the Solar System’ needs to be changed ;)

  70. http://www.vukcevic.talktalk.net/Ap-NHT.htm

    between the most trusted Ap index and the Earth’s N. Hemisphere temperature’s natural variability (narrative may follow some day).

    some day not too far away I hope. This looks impressive but without knowing how you got there it has no value. (No matter how many times you link it on WUWT).

  71. itsonlysteam says: January 26, 2014 at 10:56 am
    I wonder if what you describe in this article means the bit in ‘Wonders of the Solar System’ needs to be changed ;)

    Yes Prof Brian Cox is going to be well miffed to be contradicted.

  72. I think on balance, this paper is not refuted judging by the comments so far and judging by additional related works brought to the fore.

    Now compare this paper to the Scafetta paper.

    Compare this published paper which has been around in parts for 5 years at least and published for 3.5 years and even published and discussed here at WUWT. In the previous discussion in 2010 there was a little picking at the detail and some praise.

    What is the difference between now and then?

    The difference is the unspoken intent under the over-the-top criticism of the Scafetta paper and Copernicus.

    I saw something in the data in the Scafetta. Nothing specific but something worth making the comment that I saw something.

    That yielded such inappropriate and disturbing level of abusive response that I detected that I stepped into something else, something “inside baseball.”

    In order to resolve my perception and test the integrity of the analysis in specific relation to Scafetta and Copernicus, I dug up the old Parana paper, and inserted it into the discussion of the Copernicus thread and the “science of cycles” thread since it was relevant to both and shared context.

    Let’s face it, solar spot number versus sea level and solar spot number versus lake level sound very much alike, and you know they must share half the data. And the Parana paper was published and reviewed here at WUWT in 2010.

    How can it be that the Scafetta paper was so bad in terms of signal-to-noise ratio or correlation in view of an apparent pattern, yet Mauas obtained an r of 0.78 and got published?

    I think that is a hugely appropriate question, and the answers I got so far expose the reality, and boy, didn’t I get an answer. Mind you, I contributed to neither paper and I do not have any axe to grind in simply stating that I see a relationship in both papers, and I don’t exactly know what that relationship is. I would like to know.

    Test 1) Would Willis Eschenbach or anyone else review the Parana paper?
    Test 2) Would Willis Eschenbach link them in his review?
    Test 3) Would Willis Eschenbach use the same “rigor”, now, in the present, to trash the paper in view of the obvious implications for the Scafetta paper?
    Test 4) Would commentary follow the same line as in 2010 or would it pivot to yield a tide of abuse to the paper?

    The answers:

    Test 1) Yes he did. And yes some others did a little number crunching too.
    Test 2) No he did not.
    Test 3) Yes he did, or so it seems anyway. I’ll take him at his word.
    Test 4) No it did not. It followed the hyperbole in the Scafetta assault in some respects.

    So rather than seeking approbation, I was seeking the acknowledgement that the two issues are related but up until now treated very differently by the scientific community, by WUWT and possibly by Willis.

    This whole affair reminds me of an unseemly witch hunt of disturbing proportions. What is not apparent is why the extreme abuse of Scafetta, this paper and Copernicus? I want no part of that and it is wrong to employ the tone that came from here.

  73. Paul Westhaver:

    Your rant at January 26, 2014 at 11:39 am says

    This whole affair reminds me of an unseemly witch hunt of disturbing proportions. What is not apparent is why the extreme abuse of Scafetta, this paper and Copernicus? I want no part of that and it is wrong to employ the tone that came from here.

    I don’t need to be “reminded” of anything.

    I can see your rant is an attempt to demean someone who has pointed out inexcusable behaviour and you have no method to defend the indefensible except to ‘shoot the messenger’.

    It is an old lawyers’ saying that if you have no case then pound the law, if you cannot pound the law, then pound the table. You are pounding the table to no effect except to make yourself look ridiculous.

    Richard

  74. “””””……Willis Eschenbach says:

    January 25, 2014 at 10:20 pm

    george e. smith says:
    January 25, 2014 at 9:56 pm

    I always thought filters throw away information. Don’t see how they can add information.

    A poorly designed filter, or a filter that is poorly chosen for a given task, can indeed add spurious information to a signal, and this one is a great example……”””””

    Well I know what you mean Willis; but I tend to believe that the original real measured sampled data values, are the most information you can ever have. And if you did your sampling correctly according to the requirements of the Nyquist sampling theorem, then those samples will indeed be enough to recover the complete original continuous signal; the usual caveats on measurement error limits of course.

    Many immediately make the claim that they aren’t interested in recovering the signal, i.e. the information; they just want to get the average. (filter coming up here).

    They take this as carte blanche (french words) licence to abandon sampled data theory altogether, and just measure now and then, as well as here and there; at places 2,000 km apart if you like; we discover from Hansen et al above.

    Well one only needs a factor of two under-sampling to fold the spurious spectrum back all the way to zero frequency, which is the average that was wanted.

    You have to have valid sampled data before any kind of filtering is applied, because otherwise you are filtering a completely spurious signal.

    I don’t hold spurious “signals” to be “information” ; it is just noise.
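
    The folding described above can be illustrated with a small sketch. It assumes a pure cosine of frequency `f_signal` with an arbitrary `phase` (both hypothetical values picked for the example) and compares sampling it well above the Nyquist rate with sampling it at `f_signal`, i.e. at half the Nyquist rate.

```python
import math

def sample_cosine(f_signal, f_sample, n, phase=0.0):
    """Return n samples of cos(2*pi*f_signal*t + phase) taken at rate f_sample."""
    return [math.cos(2 * math.pi * f_signal * k / f_sample + phase) for k in range(n)]

f_signal = 1.0  # cycles per year (hypothetical)
phase = 0.5     # radians (hypothetical)

# Well sampled: 12 samples per cycle over 10 whole cycles, so the mean is ~0.
good = sample_cosine(f_signal, 12.0, 120, phase)
mean_good = sum(good) / len(good)

# Under-sampled by a factor of two relative to the Nyquist rate (2 * f_signal):
# one sample per cycle, so every sample lands on the same point of the wave.
bad = sample_cosine(f_signal, 1.0, 10, phase)
mean_bad = sum(bad) / len(bad)

print(round(mean_good, 6), round(mean_bad, 6), round(math.cos(phase), 6))
```

    In the under-sampled case the oscillation is indistinguishable from a constant offset of cos(phase), so it ends up in the very average that was wanted.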

  75. george e. smith:

    At January 26, 2014 at 12:09 pm you say

    I don’t hold spurious “signals” to be “information” ; it is just noise.

    OK, you can make that definition but it helps nothing.

    When you have a smoothed signal then how do you decide what is
    (a) a true signal,
    (b) a spurious signal,
    and (c) the confidence which you can apply to your decision?

    Information will be rejected when true signals are misidentified as being “noise” so are ignored.

    Spurious information will be adopted as ‘true’ when spurious signals are misidentified as being true signals.

    Richard

  76. “Never run a statistical analysis on smoothed data.”

    If you run an analysis on annual temperatures, aren’t you in effect using smooth data that has been averaged over 365 days?
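
    One distinction that may help here is between non-overlapping averages (each input value used once, as with annual means of daily values) and a running mean (neighbouring outputs share most of their inputs). The sketch below is an assumption-laden illustration, not anything from the paper: it uses seeded white noise and an 11-point window, and the helper `lag1_autocorr` is made up for the example.

```python
import random

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(11 * 2000)]  # synthetic white noise
w = 11

# Non-overlapping block means: each input value is used exactly once.
block_means = [sum(noise[i:i + w]) / w for i in range(0, len(noise), w)]

# Running mean: neighbouring outputs share w - 1 of their w input values.
running = [sum(noise[i:i + w]) / w for i in range(len(noise) - w + 1)]

print(round(lag1_autocorr(block_means), 2), round(lag1_autocorr(running), 2))
```

    For white-noise input the block means stay close to uncorrelated, while the running mean acquires a lag-1 autocorrelation near (w - 1) / w, which is what shrinks the effective number of independent points. Real temperature data is not white noise, so this only shows the distinction; it does not settle the question.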

  77. Paul Westhaver says:
    January 26, 2014 at 11:39 am
    I think on balance, this paper is not refuted judging by the comments so far and judging by additional related works brought to the fore.

    #################

    did we land on the moon?

  78. Willis Eschenbach says:
    January 26, 2014 at 1:01 am
    ————————————
    Willis, I was not comparing the SF chart to the ssn number. However it does have some correlation to the Parana River movements. The Pacific NW high water events correlate very close with the Parana River highs. Whereas, I can see by looking at the SF rainfall chart that SF rainfall falls to below average on each of the major Pacific NW, rain induced floods of the 40s, 50s, and 60s. Then in the 70s, 80s and 90s the opposite happens as the peak of the SF pattern coincides with the heavy rain events. The SF rain pattern does something similar when placed with the Parana River highs and lows. This is probably a clue about another modifier acting upon rainfall patterns, the moon very likely, or planetary influences of which I know very little about as to possible effects.

  79. vukcevic says:
    January 26, 2014 at 4:35 am
    ————————————-
    That was very helpful information. I tend to build pictures inside, as I puzzle and ponder. I had a knack for being able to run equations the same way, back in my school years. I should reopen that door. My life has been chaotic though, and I have never found ground to plant my tree.

  80. Why not ask the authors of this paper to explain what they have done?

    It may be wrong or what they have done may be misunderstood.

    Surely this is the first step?

  81. Unexplained correlations are where scientific inquiry starts.
    Not where it gets buried by smart alec comments.
    Read something about the history of science

  82. RC Saumarez says:
    January 26, 2014 at 4:11 am

    At first sight figure 3 is wrong.

    Take the year ~1988. The original data for the years +- 5 years appear to be negative or weakly positive. Yet according to your calculations the result is strongly positive.

    This is obviously wrong and I do not believe that the correlation between the original and low-pass filtered is 0.1. Why? Because the correlation function is multiplied by the square of filtered spectrum and integrated with respect to frequency.

    I would say that you have dropped a real clanger here.

    COME ON WILLIS, SHOW US YOUR DATA AND SHOW US YOUR CODE!

    Dear god, another fool that can’t be bothered to read, and then blames me when he can’t find his fundamental orifice in the dark …

    Richard, I clearly identified the fact that the data and the code were all on an Excel spreadsheet. However, perhaps you couldn’t find it because I cleverly hid it from those of lower literacy by wrapping it all up in one of those “sentence” thingies, like this:

    DATA: Digitized Parana streamflow data from the paper plus SIDC Sunspot data and all analyses for this post are on an Excel spreadsheet here.

    Take a look through the head post, if you can’t find it then come back and I’ll give you some more clues.

    You end up spewing nastiness because you’re not paying attention … and likely now someone will bust me for being krool to poor Richard, tell me I shouldn’t be so mean, and not say a word about your ugliness and your childish capital letters …

    In addition, Richard, just a day or so ago you were strongly defending Scafetta hiding his data and code over at the Notrickszone, saying that it was just fine for Scafetta to not reveal a damn thing if he didn’t want to.

    Now you want to bust me for exactly what you praise Scafetta for? You slimy little hypocrite, go try that out on someone else, I’m not wearing it.

    w.

  83. Oh, yeah, Richard, one more thing:

    RC Saumarez says:
    January 26, 2014 at 4:11 am

    At first sight figure 3 is wrong.

    Take the year ~1988. The original data for the years +- 5 years appear to be negative or weakly positive. Yet according to your calculations the result is strongly positive.

    This is obviously wrong

    It appears that you don’t understand what happens when you “normalize” data … amazing as it may seem, things that are negative can end up positive when you normalize them. You might try Googling it …

    Or you could just try comparing Figure 3 with Figure 2, there’s a good fellow … if you need further clues, the dashed line in Figure 2 is the red line in Figure 3 …
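
    As a toy illustration of the normalization point, assuming "normalize" here means the usual z-score (subtract the mean, divide by the standard deviation), here is a sketch with made-up numbers that are all negative:

```python
import math

def zscore(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]

values = [-5.0, -3.0, -1.0]  # made-up numbers, every one of them negative
z = zscore(values)
print([round(v, 3) for v in z])
```

    The last value is negative in the input but positive after normalization, simply because it sits above the mean of the series.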

    w.

  84. vukcevic says:
    January 26, 2014 at 4:35 am

    Hi Willis,
    You comment: I can’t understand his method. It appears that every alternate sunspot cycle has been recorded as a negative number, in order to kinda sorta convert it to a sine wave …

    I don’t think much is wrong with that, providing the reasons are justified, or at least motioned, and they were not in that paper.
    I often do it myself, and if I write narrative (which often I do not!) I mention NASA’s statement as an authoritative back up.

    Thanks, vuk. First, your answer still doesn’t answer my question, which was:

    And indeed, the annual sunspot data does not lend itself easily to flipping. Consider the following annual average sunspot counts:

    1963 27.9
    1964 10.2
    1965 15.1

    Now if you are going to “flip” the cycle that is starting in 1964, do you flip the 1964 data, or do you start the cycle by flipping the 1965 data?

    I ask because I get real nervous when choices a) are arbitrary, and b) have a big effect on the outcome.
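
    The ambiguity can be made concrete with a small sketch using only the three values quoted above. The helper `flip_from` and the idea of flipping "from a given year onward" are assumptions for illustration, not a description of how anyone actually builds a signed sunspot series.

```python
# Annual mean sunspot numbers quoted in the comment above.
ssn = {1963: 27.9, 1964: 10.2, 1965: 15.1}

def flip_from(series, first_flipped_year):
    """Negate every value from first_flipped_year onward (one cycle boundary only)."""
    return {yr: (-v if yr >= first_flipped_year else v) for yr, v in series.items()}

option_a = flip_from(ssn, 1964)  # treat the 1964 minimum as the start of the flipped cycle
option_b = flip_from(ssn, 1965)  # treat 1964 as the end of the unflipped cycle

# Year-to-year change across the boundary under each choice.
jump_a = option_a[1964] - option_a[1963]
jump_b = option_b[1965] - option_b[1964]
print(round(jump_a, 1), round(jump_b, 1))
```

    The two choices disagree about the sign of the 1964 value, and either way the flipped series has a jump at the boundary because the minimum is not zero.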

    Also, I’m in mystery about “NASA’s statement” about negative sunspot numbers … where would I find that?

    Regards,

    w.

  85. Greg Goodman says:
    January 26, 2014 at 11:06 am

    http://www.vukcevic.talktalk.net/Ap-NHT.htm
    between the most trusted Ap index and the Earth’s N. Hemisphere temperature’s natural variability (narrative may follow some day).

    some day not too far away I hope. This looks impressive but without knowing how you got there it has no value. (No matter how many times you link it on WUWT).

    vuk, let me second this. I’ve given up following links to your plots entirely because like in the one you link above, there is far too little data there to replicate it … and often too little data there to even begin to understand what you are showing.

    Just as an example, on the one you link above, where did you get your temperature data? Is it observational or reanalysis? Why are you using hemispheric temperature data when the Ap data you use is global? What are the units on the Ap data? Why didn’t you show the southern hemisphere?

    Then we have the theoretical questions. A sixteen year lag between cause and effect? Where does the energy go in the interim? What does the Gulf Stream have to do with it?

    Then we have oddities. In which dataset does the NH only warm by two tenths of a degree from 1900 to 2010? Why are there big swings in the NH temperature data? I’ve never seen that in any dataset. On what planet was 1975 colder than 1900? Is the temperature data detrended, and if so why?

    Then there are the mysteries that I find only on your graphs, and nowhere else. For example, what is “angular momentum propagation within the earth’s liquid core” when it’s at home, and what does that have to do with the price of beer?

    So … I just skip your graphs, and try to follow your posts, and respond when I can …

    Anyhow, a whole slew of sources and comments would help your graphs greatly. And as Greg notes, reposting them doesn’t help.

    My best to you, and please take this in the positive sense in which it is intended,

    w.

  86. w – re linking to the paper : apologies, normally I would. This time I was away from home on an iPad, short of time, and unable to find the paper quickly.

    A minor point, but when you say in your criticism “The authors (not NASA but the authors)”, I think that is an unreasonable distinction : the first two authors are from the Jet Propulsion Laboratory, which is part of NASA, and the third author was “supported by NASA grant NNG04GN02G to the California Institute of Technology.”

    Regarding statistical significance and noise, the authors do say “We see that the 88-year and 260-year modes are statistically significant at 2σ level against the white noise and at 1σ level against the strongly correlated fractional noise (with the exception of the last mode for the high waters).”, but they also say “The 1000-year long record analyzed by Stager et al. [2005] shows marked correlation between the lake [Victoria] levels and solar variability (proxied by the atmospheric radiocarbon). Also a strong correlation between the atmospheric radiocarbon variations caused by solar variability and the levels of a small equatorial lake Naivasha (Kenya) was found [Verschuren et al., 2000]. Co-occurrence of lake level rises with minima of solar variability [see Stager et al., 2005, Figure 4] continues back in time overlapping with the Nile records used in our paper.”

    I suppose it’s a bit like the Maunder Minimum, in that correlation with solar activity seems likely, but “proving” it is quite a different matter.

  87. Paul Westhaver says:
    January 26, 2014 at 11:39 am

    I think on balance, this paper is not refuted judging by the comments so far and judging by additional related works brought to the fore.

    I’ve shown clearly that the “smoothing” method that they used totally destroys the original data, changes low points into peaks and vice versa, and astonishingly, ends up with a negative correlation to the data.

    On what planet does that NOT refute their paper? You might re-establish their underlying claim in some other manner, but their paper is toast.
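
    One mechanism by which a running mean can turn peaks into troughs is its frequency response. For a centred N-point running mean, the gain applied to a sinusoid with a period of P samples is sin(πN/P) / (N sin(π/P)), which is negative for periods between N/2 and N. The sketch below checks that formula numerically on a synthetic cosine; the 8-year period is an arbitrary illustrative choice and is not taken from the paper or from the spreadsheet.

```python
import math

def running_mean_gain(n, period):
    """Gain of a centred n-point running mean for a sinusoid of the given period."""
    return math.sin(math.pi * n / period) / (n * math.sin(math.pi / period))

def centred_running_mean(x, n):
    """Centred n-point running mean (n odd); returns only fully covered points."""
    half = n // 2
    return [sum(x[i - half:i + half + 1]) / n for i in range(half, len(x) - half)]

n, period = 11, 8.0  # 11-point window; hypothetical 8-year oscillation
signal = [math.cos(2 * math.pi * t / period) for t in range(200)]
smoothed = centred_running_mean(signal, n)

# Compare each smoothed value with the input value at the same time step.
half = n // 2
aligned = signal[half:len(signal) - half]
gain = running_mean_gain(n, period)
max_err = max(abs(s - gain * a) for s, a in zip(smoothed, aligned))

print(round(gain, 4), max_err < 1e-9)
```

    A negative gain means the smoothed series is an attenuated, upside-down copy of the input at that period, and a period equal to the window length is removed entirely.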

    w.

  88. Paul Westhaver says:
    January 26, 2014 at 11:39 am

    I think on balance, this paper is not refuted judging by the comments so far and judging by additional related works brought to the fore.

    Thanks, Paul. As mentioned above, their paper is refuted by their abysmal smoothing method. But what about their underlying claim?

    Here’s the cross-correlation:

    Note that in contradiction to their claim of perfect temporal alignment, the biggest signal is at a lag of two years. However, even that signal is still a long ways from significance, with R^2 = 0.05 and the p-value = 0.10.

    So unless you know some other way to measure it, I’d say that not only is the paper refuted, but the underlying claim is refuted as well.

    w.

  89. george e. smith: “Well I know what you mean Willis; but I tend to believe that the original real measured sampled data values, are the most information you can ever have. And if you did your sampling correctly according to the requirements of the Nyquist sampling theorem, then those samples will indeed be enough to recover the complete original continuous signal;”

    While it’s best to work with the least processed data if possible, there are cases where you need to filter. As I said above, if you want to look for a correlation on inter-annual to decadal scales that is an order of magnitude or two smaller than the annual cycle, you need to remove that cycle. Also, if you want to do spectral analysis on such a signal, the annual cycle will saturate the dynamic range and severely reduce the reliability of the rest of the spectrum. There will also be artefacts around the annual signal which will render most of the 0.5 to 1.5 year band useless.

    You may also need to remove autoregression from the data before trying to estimate correlation coeffs or do spectral analysis. That also implies the need to use processed data.
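
    Removing the annual cycle is commonly done by subtracting each calendar month’s long-term mean. A minimal sketch on a synthetic monthly series follows; the amplitudes, the 10-year oscillation and the helper name `remove_annual_cycle` are all assumptions chosen for the example.

```python
import math

def remove_annual_cycle(monthly):
    """Subtract each calendar month's long-term mean (series must start in month 0)."""
    clim = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(monthly):
        clim[i % 12] += v
        counts[i % 12] += 1
    clim = [s / c for s, c in zip(clim, counts)]
    return [v - clim[i % 12] for i, v in enumerate(monthly)]

# Synthetic series: large annual cycle plus a small 10-year oscillation.
months = 12 * 40
annual = [10.0 * math.cos(2 * math.pi * m / 12) for m in range(months)]
decadal = [0.5 * math.sin(2 * math.pi * m / 120) for m in range(months)]
series = [a + d for a, d in zip(annual, decadal)]

anomalies = remove_annual_cycle(series)
max_residual = max(abs(x - d) for x, d in zip(anomalies, decadal))
print(max_residual < 1e-9)
```

    In this idealised case the small decadal signal is recovered exactly because the record spans a whole number of its cycles; with real data the monthly means would also absorb a little of the slow signal.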

    That’s why I think Briggs’ “you should never…” type comments are ill-informed and unhelpful. Sadly they seem to be getting repeated and linked rather too often.

  90. Paul Westhaver says:
    January 26, 2014 at 11:39 am

    I think on balance, this paper is not refuted judging by the comments so far and judging by additional related works brought to the fore.

    Now compare this paper to the Scafetta paper.

    OK, I’ll compare them. Scafetta refused to share his data and code, so we don’t know what he did or how he did it.

    As a result … I can’t compare this paper to the Scafetta paper.

    Sorry, that’s as far as I can get. I could disassemble and replicate this Parana study exactly. I cannot do that with Scafetta’s work because, unlike my high-school chemistry teacher, who would give us an “F” if we didn’t show our work, Scafetta gives himself an A+ and doesn’t show anything … sorry, Paul, but that’s not science.

    You go on to state that Scafetta attracts attention here, viz:

    I saw something in the data in the Scafetta. Nothing specific but something worth making the comment that I saw something.

    That yielded such inappropriate and disturbing level of abusive response that I detected that I stepped into something else, something “inside baseball.”

    Yes, there is a backstory. He published a couple of his polemics here, and they got … well, a cool reception, accompanied by calls for his code and data. He got all huffy, refused to share either one, abused us all roundly, and limped off to Tallblokes to lick his wounds. Periodically he comes back, abuses us once again, says if we had any brains we’d recognize his genius, tells me I’m too uneducated to understand his math, things like that, then goes back to Tallblokes and whines about how badly I treat him.

    So when someone steps in and starts prating about Scafetta’s data, well, that’s a sore point around here because he refuses to share, reveal, or archive either his code or his data … sorry you got caught up in it.

    w.

  91. RC Saumarez says:
    January 26, 2014 at 3:51 pm

    Why not ask the authors of this paper to explain what they have done?

    It may be wrong or what they have done may be misunderstood.

    Surely this is the first step?

    Great plan, RC! Ask them about the effect of the 11-year running mean while you’re at it, see what they say. You should have some interesting news to report back to us, I await their comments.

    w.

    PS: Since I was able to replicate their Figure 2 completely from their data and their paper, I kinda suspect I didn’t misunderstand them …

  92. afjacobs says:
    January 26, 2014 at 4:21 pm

    Unexplained correlations are where scientific inquiry starts.
    Not where it gets buried by smart alec comments.
    Read something about the history of science

    Unexplained REAL correlations are where scientific inquiry starts. The trick is to tell the real correlations from the spurious. We have an entire branch of math dedicated to just that … math which you seem content to ignore.

    There is a clear correlation, for example, between the CO2 levels and the price of US stamps … should we rush to investigate?

    Correlations are everywhere, and time is short. If you want to chase every correlation, significant or not, be my guest—I don’t have the time for that.
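
    Here is a quick sketch of how two unrelated series can show a high correlation just because both trend. The data is synthetic (seeded random noise on two made-up linear trends), and differencing is only one of several ways to check whether a correlation survives once the trend is removed.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n = 500
# Two unrelated synthetic series that both happen to trend upward.
x = [0.10 * t + random.gauss(0.0, 1.0) for t in range(n)]
y = [0.03 * t + random.gauss(0.0, 1.0) for t in range(n)]

r_levels = pearson(x, y)

# First differences remove the shared trend and leave the unrelated noise.
dx = [b - a for a, b in zip(x, x[1:])]
dy = [b - a for a, b in zip(y, y[1:])]
r_diffs = pearson(dx, dy)

print(round(r_levels, 2), round(r_diffs, 2))
```

    The correlation of the raw levels comes out high while the correlation of the year-to-year changes is close to zero, which is the pattern one would expect from a coincidental shared trend.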

    w.

  93. Mike Jonas says:
    January 26, 2014 at 6:24 pm

    w – re linking to the paper : apologies, normally I would. This time I was away from home on an iPad, short of time, and unable to find the paper quickly.

    A minor point, but when you say in your criticism “The authors (not NASA but the authors)”, I think that is an unreasonable distinction : the first two authors are from the Jet Propulsion Laboratory, which is part of NASA, and the third author was “supported by NASA grant NNG04GN02G to the California Institute of Technology.”

    That’s true, Mike, but it was just a study by NASA authors, not an official NASA study. Those are done and published by NASA, this was done by individual authors and published in a journal by the authors.

    Regarding statistical significance and noise, the authors do say

    “We see that the 88-year and 260-year modes are statistically significant at 2σ level against the white noise and at 1σ level against the strongly correlated fractional noise (with the exception of the last mode for the high waters).”,

    but they also say

    “The 1000-year long record analyzed by Stager et al. [2005] shows marked correlation between the lake [Victoria] levels and solar variability (proxied by the atmospheric radiocarbon). Also a strong correlation between the atmospheric radiocarbon variations caused by solar variability and the levels of a small equatorial lake Naivasha (Kenya) was found [Verschuren et al., 2000]. Co-occurrence of lake level rises with minima of solar variability [see Stager et al., 2005, Figure 4] continues back in time overlapping with the Nile records used in our paper.”

    I suppose it’s a bit like the Maunder Minimum, in that correlation with solar activity seems likely, but “proving” it is quite a different matter.

    You do understand that their first statement means the Nile results are NOT significant, don’t you? The rest just seems like handwaving to re-establish that they are significant … but they aren’t.

    Finally, if this current Parana study has taught you anything, it should have taught you to be very, very suspicious of this kind of analysis. There are many, many pitfalls, and even older papers like this one can contain egregious errors like their 11-year running mean “smoothing”.

    Next, yes, I’m sure they can find the occasional river, or one lake, that has some correlation with sunspots over some period … but by the time they’ve looked at four rivers to find that one that is “significant”, they’ve forgotten that they need to adjust significance levels to allow for repeated trials.
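
    The repeated-trials adjustment can be put in numbers. The sketch below is only illustrative: it assumes the trials are independent, takes the count of four from the sentence above, and uses a 0.05 threshold that is an assumption rather than anything from the papers. The helper names `familywise_error` and `bonferroni_alpha` are made up for the example.

```python
# Chance of at least one "significant" result arising by luck alone when
# several independent series are tested and only the best one is reported.
# The 0.05 threshold and the independence of the trials are assumptions.

def familywise_error(alpha: float, trials: int) -> float:
    """Probability of one or more false positives in `trials` independent tests."""
    return 1.0 - (1.0 - alpha) ** trials


def bonferroni_alpha(alpha: float, trials: int) -> float:
    """Stricter per-test threshold that keeps the overall rate at or below alpha."""
    return alpha / trials


if __name__ == "__main__":
    # Four rivers tested at the 0.05 level: roughly a 0.185 chance of a fluke.
    print(familywise_error(0.05, 4))
    # Per-river threshold needed to keep the overall rate near 0.05.
    print(bonferroni_alpha(0.05, 4))
```

    In other words, under these assumptions each of the four rivers would need to clear about p = 0.0125, not p = 0.05, before the "winner" could be called significant at the 0.05 level.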

    So as I said at the top, there may be a relationship … but given how long people have looked, and how weak and how little they’ve found, any reasonable person would have to admit that IF such an effect exists, it’s not a very big effect …

    My best to you,

    w.

  94. Greg Goodman says:
    January 26, 2014 at 7:09 pm

    … While it’s best to work with least processed data if possible there are cases where you need to filter. As I said above, if you want to look for correlation on an inter-annual to decadal scale that is an order of magnitude or two smaller than the annual cycle, you need to remove it. Also if you want to do spectral analysis on such a signal the annual cycle will saturate the dynamic range and severely reduce the reliability of the rest of the spectrum. There will also be artefacts around the annual signal which will render most of the 0.5 to 1.5 year band useless.

    You may also need to remove autoregression from the data before trying to estimate correlation coeffs or do spectral analysis. That also implies the need to use processed data.

    That’s why I think Briggs’ “you should never…” type comments are ill-informed and unhelpful. Sadly they seem to be getting repeated and linked rather too often.

    My own feeling is that I’d rather people took Briggs’ advice than not; the resulting problems would be fewer. Yes, you are right that there are times when filtering is not an option, it is a necessity. However, in climate science those times are not as common as in, say, electrical signal analysis.

    And certainly, when you see egregious examples such as the choice of the 11-year filter in the Parana paper by people who are established scientists, you’ve got to agree that casual filter use is an issue in the field …

    One recurring problem that gets too little attention is the common practice of removing the “climatology”, that is to say subtracting the month-by-month average values of the variable in question. Those “reduced” values are then taken as accurate datapoints when doing things like calculating trends.

    But the problem is, once you remove the monthly values, the resulting points all inherit the standard error of the mean in the climatology … and how often is that taken into account when calculating the trend in say the satellite tropospheric temperatures? We just use the regular methods for the error in trend, without including the additional error inherent in the climatology.

    Me, I use filters a lot to help me understand what’s going on, by overlaying a gaussian or a loess average over the data. But I do my best to avoid running statistical tests on filtered data. Yes, you can kind of adjust for the increased autocorrelation caused by the smoothing by reducing the effective “N”, the number of data points. And in fact, often you have to adjust for autocorrelation even when there is no smoothing.

    But in general the methods are ad-hoc, and often either over- or under-estimate the actual significance of the results. So I try to keep it as simple as possible.
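
    One of those ad-hoc adjustments, sketched for illustration: a common rule of thumb shrinks N using the lag-1 autocorrelation r1, with N_eff = N (1 - r1) / (1 + r1). That formula assumes the series behaves roughly like a first-order autoregressive process, which is an assumption and not something taken from the Parana paper. The function names are hypothetical.

```python
import random


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    den = sum(d * d for d in dev)
    return num / den


def effective_n(x):
    """Rule-of-thumb effective sample size, N * (1 - r1) / (1 + r1).

    Negative r1 is clipped to zero so N is never inflated.
    """
    r1 = max(lag1_autocorr(x), 0.0)
    return len(x) * (1.0 - r1) / (1.0 + r1)


def running_mean(x, width):
    """Plain running mean; output is shorter than the input by width - 1."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


if __name__ == "__main__":
    rng = random.Random(0)
    raw = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # synthetic white noise
    smooth = running_mean(raw, 11)
    print(len(raw), round(effective_n(raw)))
    print(len(smooth), round(effective_n(smooth)))
```

    On synthetic white noise the raw series keeps most of its N, while the 11-point running mean of the same data is left with only a small fraction of its nominal N.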

    And having had to deal with the kind of garbage smoothing we see in this study far too often, I say, like Briggs, DON’T RUN STATISTICAL TESTS ON SMOOTHED DATA … knowing full well that people like yourself who know what they are doing and know what kind of filters to use and why they are using that particular filter, well, you’ll filter anyhow, as well you should.

    Regards,

    w.

  95. Willis: “However, in climate science those times are not as common as in, say, electrical signal analysis.”

    I agree with your comment in general, it’s best to err on the side of caution. However, the need to remove the annual cycle in climate science is as omnipresent as the need to remove mains ‘humm’ in audio electronics.

    The only way you can avoid filtering it is by doing something silly like subtracting a “climatology” which of course also affects the degrees of freedom.

    When I find this kind of detail is available from satellite data and they are splashing around “climatologies” of monthly averages and running means it makes me want to weep:

    http://climategrog.wordpress.com/?attachment_id=756

    Hardly surprising there’s been so little progress in the last 20 years.

  96. @Ox AO: thank you.
    It would be interesting to repeat the analysis using the Parana River data with a better time-resolution (monthly values). I will have a look if I can find these data …

  97. One of the problems in the paper is the reliance on the so called ‘stream flow’ which is a very rough proxy for rainfall and can be highly variable even if the rainfall is consistent.

    I have proposed that solar effects cause latitudinal shifting of climate zones so the effect on rainfall is hard enough to correlate with solar activity let alone the consequent stream flow because a given area can have a complex relationship with the rain bearing winds above as they move to and fro latitudinally.

    Even the shifting of climate zones is ‘noisy’ around the globe as one can see from the variable locations of the ‘dips’ in the jet stream tracks from year to year.

    So, if there is indeed a solar / stream flow relationship it is bound to be very poorly correlated but that does not mean that there is no relationship.

    It more likely means that to discern the relationship one needs to observe over longer periods of time than a single solar cycle.

    The sort of indicator I would see as useful would be as to whether the stream flow changes on average across decades in line with the long slow and very irregular changes in solar activity from say the MWP to LIA or LIA to date.

    On those timescales I think there is a relationship as witness the rise and fall of civilisations as the climate changed around them over multi-centennial periods. Many such developed due to abundant supplies of natural resources but declined as the environment altered around them.

    So my take on this thread is that Nicola is being too ambitious in trying to discern (let alone prove) significance at the level of just a few solar cycles and Willis may be right in calling him out on that but there is still room for a real relationship between sun, global air circulation and stream flows around the globe over longer periods of time.

  98. Hi Willis
    Thanks for finding time and effort, my posts demonstrate most vividly how not to do science, the reason I do not object to the stuff dismissed as pseudoscience etc. It is more of ‘what if ?’ than ‘it is’.
    You made a number of very valid points, so I shall make an attempt to explain, but it is not necessarily correct.
    – On SSN number flipping. Scan down the annual data, the lowest score around a minimum will change sign. In your example:
    1963 +27.9
    1964 -10.2
    1965 -15.1
    – Arbitrary choice (cherry picking) is deplorable, but in this case, where I attempt to demonstrate principle, rather than project accuracy (neighbouring cycles in practice do overlap by number of years anyway) it doesn’t make great deal of difference.
    – NASA’s statement, I have added link on the graph , but here it is:

    http://science1.nasa.gov/science-news/science-at-nasa/2008/16dec_giantbreach/

    On this point Dr. S has made a number of points at your ‘SSN & sea level’ thread.
    It is not disputed that both SSN and Ap index data show difference between the even and odd cycles. A young solar scientist wrote about it 45 years ago see: Fig. 23 in http://www.leif.org/research/suipr699.pdf , and now is going to revisit the issue.
    It is a matter of how these differences are treated. If a median is calculated and normalised to zero, then the odd cycles’ remnants will be positive and the even ones negative (with one or two exceptions), i.e. it is not a robust rule. I simplified the process by not following that procedure, but used the easy (lazy man’s) option and inverted whole cycles (I did say at the start ‘how not to do science’).
    – I also agree entirely with what Greg and you said at the top of your next comment, and if I may add, even if all calculations are shown in fine detail, without physical mechanism, it is no more than numerology.
    – Temperature data is from NASA-GISS (de trended, 3 year moving average), Ap index provided by Dr.S. These details are now on the graph link(see above).
    Ap data from individual stations are covering only recent decades, back to 1840s are reconstructed from SSN. (Dr.S is an authority on this one).
    Why N. Hemisphere? It is very different to S. H, as far as distribution of geo-magnetic field intensity, i.e. existence of the field bifurcation . The physics I have in mind, don’t think would work in the S.H. where field is omni-centred
    – 16 year lag is part of the mechanism I have in mind, there are number of papers, from NASA-JPL (on google scholar use ‘Earth differential rotation Dickey’, too many links to quote directly )
    – Gulf stream is the critical factor, it moves heat energy from subequatorial to mid and high latitudes, it is principal contributor to the subpolar gyre’s SPG circulation, which is the engine of the heat transport across the North Atlantic Ocean ( feedback circulation delays in the SPG as mentioned above -google ‘subpolar gyre feedback: Treguier et al 2004,, Levermann et al 2007, Born et al 2011 )
    – Energy is not going anywhere, it is in the N. Atlantic ocean; it is just a matter of one of the two feedbacks, or both (Earth differential rotation & SPG circulation), as to when the energy is released.
    Then we have oddities :De-trended anomaly (NASA-GISS) as stated above
    Angular momentum in price of beer papers by Dickey et al, as above.
    I just skip your graphs wise move, most of others do it too.
    take this in the positive sense Will do, but only in the even numbered solar cycles.
    This must be the longest of my posts, but I thought it is only fair to ‘answer’ all of your questions (hopefully not nonsense in its entirety , pseudo-science is unavoidable). I have learned my lesson, I shall avoid addressing you again.
    Thanks, had lot of fun reading and replying to your observations, all the best.
    regards. vuk.

  99. Blair says:
    January 26, 2014 at 7:14 am

    They simply don’t know when they get something wrong because they rely entirely on software. It has happened so frequently we laughingly gave it a name; The sorcerer’s apprentice syndrome. Apparently there is also a button which says, “Design The HVAC systems”.

    They do however, show no lack of confidence in their results.

    The late Bob Pease used to rail about this frequently in electronic design. He said – you design with a “slide rule” and check with a circuit simulator – if you use a simulator at all. And don’t depend on the simulator for numerical results. Use it for finding trends – if I increase this resistor it changes the operation this way. The problem is particularly acute in the tuning of PID loops which are common in chemical plants and HVAC systems.

    I have designed autotune functions for PID loops. If the data your plant produces is very noisy they can produce terrible results. (Ah – back to the theme of this post) But you can’t sell controllers without it these days so….

  100. However, the need to remove the annual cycle in climate science is as omnipresent as the need to remove mains ‘humm’ in audio electronics.

    The mains frequency is going to be 60 Hz (50 Hz) to better than 0.1%. Easily filtered. You know quite well when to expect events (peaks, zero crossings). And if your filter is narrow it hardly affects how you hear the music. Now apply an annual filter to noisy data. Can you always expect snow on 1 Jan? Does the summer temperature always peak on 10 August? (plus or minus 0.365 days to get us in the 0.1% band). And BTW the mains are recentered to reduce absolute phase drift to well under 1 second (60 cycles) indefinitely so your clocks which time off the mains don’t go drifting away from real time too much. Your VCR, DVR, microwave, etc wouldn’t like that. Not to mention synchronous motor clocks.

    The recentering is to NIST time which is good to 1E-12 or better over long periods. You start with a low drift oscillator and then discipline it. That is how you get local short term accuracy. The filter times get very long compared to the oscillator frequency, because the NIST transfer source (satellites) is noisy (+/-1E-7 jitter). You start with short filters to get close. And then make them longer.

  101. And just to clarify: “And then make them longer” refers to averaging times. Every 100X increase in averaging time gets you a 10X increase in accuracy (it is a square root function). What you rely on is the one second tick from your GPS receiver.
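
    The square-root rule can be checked numerically. This is a minimal sketch that assumes the errors on each tick behave like independent white noise, which is an assumption (real jitter need not be so well behaved). The helper `scatter_of_average` is hypothetical.

```python
import random
import statistics


def scatter_of_average(n_samples, n_trials=1000, seed=1):
    """Standard deviation of the mean of n_samples unit-variance noise values,
    estimated from n_trials repeated experiments."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        draws = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        means.append(statistics.fmean(draws))
    return statistics.pstdev(means)


if __name__ == "__main__":
    short = scatter_of_average(1)     # no averaging
    long_ = scatter_of_average(100)   # 100X longer averaging time
    # For independent noise the ratio should come out close to 10.
    print(short / long_)
```

    If the noise is correlated from tick to tick, the improvement with averaging time is slower than the square root suggests.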

  102. scf says:
    January 25, 2014 at 5:31 pm

    I have two grandfather clocks. The timing of the chimes is very strongly correlated, to over a 99% level. Also, one of the two always chimes first, so that one must be causing the other one to chime.
    When the first one of them stops chiming so does the other, so this confirms the causation (whenever a power failure occurs).

    This comment is no sillier than the Parana river study.

    Well that is interesting. It turns out that two grandfather clocks that are coupled in any way (placed on the same wooden floor for instance) will synchronize. They cease to be independent time keepers. You have to go to a LOT of trouble to isolate them from each other. Studies have been done on this.

    https://www.google.com/#q=grandfather+clock+coupling

    Injection locking is another good search term.

  103. M Simon:

    To help others I insert a link in your comment and add an observation.

    Willis Eschenbach says here:
    January 26, 2014 at 5:21 pm

    Tasty.

    And true.

    Richard

  104. Nice article on coupled clocks http://blogs.unimelb.edu.au/sciencecommunication/2012/10/28/coupled-oscillators-and-the-tale-of-huygens-clocks/

    With this video. http://youtu.be/DD7YDyF6dUk

    Obviously they went to a lot of trouble to make the coupling reasonably good. It is a short video after all. Looks like the side off an old computer (the table) mounted on some handy rollers (mailing tubes). Clocks with integer related periods will synchronize given enough time. And the period relation need not be an exact multiple. Numbers like 3:5 will work given enough time, although numbers like 7:31 rarely work. The repeating period becomes too long and chaos takes over.

  105. richardscourtney says:
    January 27, 2014 at 5:12 am

    I usually search (‘find’ function) on a time. Such as “5:21″ which usually produces one result on a page.

    Or I will use a phrase mentioned to see all comments about it. Your method certainly produces much more exact results. But it is a lot more work.

  106. M Simon:

    At January 27, 2014 at 5:37 am you say to me

    I usually search (‘find’ function) on a time. Such as “5:21″ which usually produces one result on a page.

    Yes, and ctrl-f does the same but is quicker.

    One needs knowledge of such methods to use them. Hence, providing a link helps people.

    Richard

  107. This bit of text goes exactly to the discussion here.

    From: http://blogs.unimelb.edu.au/sciencecommunication/2012/10/28/coupled-oscillators-and-the-tale-of-huygens-clocks/

    The implications of this field are more far-reaching than messing with timepieces however. Systems of coupled oscillation turn up in the mating flashes of fireflies along the tidal rivers of Malaysia, in the gait of a horse (trot, gallop or canter) and in your very own footsteps when walking next to someone on the way to 7-11. More importantly, they influence the mechanic behaviour of fluids and electromagnetic fields.

    Sometimes it only takes the smallest of observations to make a big discovery.

    Never stop being curious! This is Ryan, signing off.

  108. It appears, under the pressure to “publish or perish”, Mauas, Flamenco, and Buccino have relied upon the old white-collar maxim: “If you can’t dazzle them with brilliance, baffle them with BS.”
    What they have overlooked in their zeal is that, in the Information Age, everything that goes into cyberspace stays there forever, and the old gunslinger’s maxim that “No matter how good you think you are, sooner or later you’ll come up against someone just a little bit better.”
    In the long run researchers will be so much better off if they stick to their home turf, try to get everything perfect the first time, and admit their own errors (they’ll learn from them if they do).

  109. M Simon says
    The implications of this field are more far-reaching than messing with timepieces however. Systems of coupled oscillation turn up in the mating flashes of fireflies along the tidal rivers of Malaysia, in the gait of a horse (trot, gallop or canter) and in your very own footsteps when walking next to someone on the way to 7-11. More importantly, they influence the mechanic behaviour of fluids and electromagnetic fields.

    Sometimes it only takes the smallest of observations to make a big discovery.
    Never stop being curious! This is Ryan, signing off.

    Henry says
    sorry to see you go. You have a point there. Actually the flowrate of a river is somehow related to the amount of rainfall, at certain latitudes at certain times and as Stephen pointed out earlier, rainfall stats has its pitfalls, due to its high variability. In its turn, the amount of rainfall at a certain latitude depends again on the amount of energy coming through the atmosphere. Due to the equator-pole differential, more energy coming in just means more clouds and rain travelling to the higher latitudes and less energy coming in means more rain and clouds at the lower latitudes and less rain and clouds travelling to the higher latitudes.
    Hence my question here

    http://wattsupwiththat.com/2014/01/25/sunny-spots-along-the-parana-river/#comment-1550138

    to which nobody here even has an answer …..
    The best way to evaluate the amount of energy coming in, is to look at maximum temperatures.
    Now why on earth am I only the person to have done just that?

    http://blogs.24.com/henryp/2012/10/02/best-sine-wave-fit-for-the-drop-in-global-maximum-temperatures/

    “Clueless” comes to my mind here.

  110. There have been articles on how badly done most of the work is in Climate summation.

    If you want a tutorial on how bad this all is, then go and read this thread on Judith Curry’s site.

    http://judithcurry.com/2013/11/22/data-corruption-by-running-mean-smoothers/

    Data corruption by running mean ‘smoothers’
    Posted on November 22, 2013 by Greg Goodman

    or visit Greg’s own site for the same article.

    http://climategrog.wordpress.com/2013/05/19/triple-running-mean-filters/

    Worth trying a 15 year low pass FIR filter (CTRM with appropriate parameters of 180, 149 and 132) on the data sets to get a clean multi-decadal view of what is really there I think.

  111. Greg Goodman says:
    January 26, 2014 at 9:51 pm

    However, the need to remove the annual cycle in climate science is as omnipresent as the need to remove mains ‘humm’ in audio electronics.

    The only way you can avoid filtering it is by doing something silly like subtracting a “climatology” which of course also affects the degrees of freedom.

    Or use a Cascaded Triple Running Mean with 12, 10, 8 as values :-) Removes the need for Normals of any sort or period. Less bias that way.

    And for Weather/Climate binary chop how about a CTRM with values of 180 months, 149 and 132.

    Gives a nice 15 year decadal/multi-decadal stop/pass band. A Weather/Climate Occam’s Razor of a filter.
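
    For readers who have not met a cascaded triple running mean, here is a minimal sketch using the 12, 10 and 8 month widths quoted above. It assumes evenly spaced monthly data with no gaps, and the function name `ctrm` is just a label for the example. Note that the output is shorter than the input by 27 points, because no attempt is made to pad the ends.

```python
import math


def running_mean(x, width):
    """Plain running mean; output is shorter than the input by width - 1."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


def ctrm(x, widths=(12, 10, 8)):
    """Cascaded triple running mean: apply the three running means in turn."""
    out = list(x)
    for width in widths:
        out = running_mean(out, width)
    return out


if __name__ == "__main__":
    # Synthetic monthly series (illustrative only): a slow 20-year wave
    # plus a pure 12-month cycle of the same amplitude.
    months = range(480)
    slow = [math.sin(2 * math.pi * t / 240) for t in months]
    annual = [math.sin(2 * math.pi * t / 12) for t in months]
    series = [s + a for s, a in zip(slow, annual)]

    filtered = ctrm(series)
    print(len(series), len(filtered))             # 480 453
    # The pure annual cycle on its own is removed almost exactly.
    print(max(abs(v) for v in ctrm(annual)))
```

    The same function with widths of 180, 149 and 132 gives the 15-year version mentioned above, at the cost of losing 458 months from the ends of the record.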

  112. To me, Willis’ insistence that you can’t run statistical analysis on smoothed data is a bit like insisting that radio transmission can’t possibly work. If I record all the radio noise coming into my house, when I look at it in the raw it won’t look like it correlates to anything at all, it’s just noise. But if I smooth it through a filter and run it through a speaker and suddenly discover a high correlation between it and what someone across town is saying into a microphone, should that be dismissed as a spurious correlation?

    What they’ve found is that sunspot data and riverflow data have a high correlation over some portion of the data’s bandwidth, and I think that’s fully true and legitimate. Given the 100 year length of the data and the 11 year centering of the bandwidth, I don’t think it’s enough to say anything absolutely conclusive, but it’s interesting, and makes it at least worth thinking about whether there’s a mechanism there that could explain it. Certainly, just pointing out the obvious, that smoothing changes the way the data “looks” isn’t enough to consider the paper debunked.

  113. By the way, I’m responding ONLY to the conclusion that you shouldn’t do statistical analysis on smoothed data. I don’t really know anything about other attempts to link sunspots to climate, and agree that if this is meaningful it should be seen in other places as well. But the idea that smoothing data ruins it is not a correct conclusion.

  114. btw
    just in case you did not know: it is cooling now
    for the duration of at least one Schwabe solar cycle (12 years)

    http://www.woodfortrees.org/plot/hadcrut4gl/from:1980/to:2012/trend/plot/hadcrut3gl/from:1980/to:2012/plot/hadcrut3gl/from:1980/to:2012/trend/plot/rss/from:1980/to:2012/plot/rss/from:1980/to:2012/trend/plot/hadsst2gl/from:1980/to:2012/plot/hadsst2gl/from:1980/to:2012/trend/plot/hadcrut4gl/from:1980/to:2012/plot/hadcrut3gl/from:1980/to:2012/trend/plot/hadsst2gl/from:1980/to:2012/trend/plot/rss/from:1980/to:2012/trend

    as expected, it is cooling from the top [90] latitudes downwards, see here,

    that means more rain at the lower latitudes and less at the higher latitudes.
    paradoxically, as always with the local weather, the less rain, the warmer it gets, ….

    http://blogs.24.com/henryp/2013/04/29/the-climate-is-changing/

    making (some) people think/claim that there is still global warming….

  115. Nancy C says:
    January 27, 2014 at 10:49 am

    I agree. I cannot understand people’s reluctance to use simple low pass filters on the data to separate out Weather from Climate. (see above).

    That is what everybody SAYS they want to do.

    But when you point out that standard audio/radio/power/just about every other branch of science uses them all the time you are met with such stubborn resistance.

    Why I don’t know, but it has become pathological on both sides of the debate to ignore the blindingly obvious.

    And who in their right mind would run an FT on data so short with such a large amount of noise in the mix. No chance at all, ever, of getting any real peaks in the response. You can fit just about anything you like through the data window you have in Fourier terms and get no real clear outcomes. Just smeared all over the spectrum. That’s the real maths for you.

    Yearly signal

    http://www.woodfortrees.org/plot/uah/plot/uah/mean:12/mean:10/mean:8

    GISS

    HadSST

    HadCrut

    None so blind who will not see.

  116. Indeed: Spurious Correlation from Filtering

    I find that if you start with two random (white) sequences, low-pass them with moving average, and especially if you also detrend with a high-pass (net result – band-pass) you get bands of noise around the band-pass peak, and these of course then emerge in the correlation as belonging to both. Right where you PUT THEM. Here is my work so far.

    http://electronotes.netfirms.com/AN403Draft.pdf
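
    The low-pass half of that experiment is easy to try at home. The sketch below is not the method in the linked note: it leaves out the detrending step, uses an 11-point running mean and series of length 100 to mimic a century of annual data, and all function names are made up for the example. It draws many pairs of independent white-noise series and compares the typical size of the correlation before and after smoothing.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def running_mean(x, width):
    """Plain running mean; output is shorter than the input by width - 1."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


def mean_abs_corr(n_pairs=300, length=100, width=11, seed=0):
    """Average |r| between independent noise series, raw and smoothed."""
    rng = random.Random(seed)
    raw_total = 0.0
    smooth_total = 0.0
    for _ in range(n_pairs):
        a = [rng.gauss(0.0, 1.0) for _ in range(length)]
        b = [rng.gauss(0.0, 1.0) for _ in range(length)]
        raw_total += abs(pearson(a, b))
        smooth_total += abs(pearson(running_mean(a, width),
                                    running_mean(b, width)))
    return raw_total / n_pairs, smooth_total / n_pairs


if __name__ == "__main__":
    raw, smooth = mean_abs_corr()
    print("typical |r|, raw series:     ", raw)
    print("typical |r|, smoothed series:", smooth)
```

    The two series in each pair have nothing to do with each other, yet the smoothed versions typically show noticeably larger correlations than the raw ones, purely as a by-product of the smoothing.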

  117. Bernie Hutchins says:
    January 27, 2014 at 8:11 pm

    “Indeed: Spurious Correlation from Filtering”

    I’ll add to that:

    Stepwise integration as a practical methodology for constructing a long term RMS power meter.

  118. Climate Scientist: I want a tool to examine Climate Temperatures.

    Geek: How do you define Climate?

    Climate Scientist: Longer than 30 years.

    Geek: So you want a tool that will show how the planet’s temperature responds in periods of more than 30 years?

    Climate Scientist: Yes.

    Geek: Well basic theory says that a Low Pass filter with a corner frequency of 15 years will do exactly what you want.

    Climate Scientist: But that’s not complicated enough and anyway that does not show me what I like to see. It says that there are natural oscillations in the signal and my theory says they don’t exist.

    Geek: ??????????

  119. HenryP says:
    January 27, 2014 at 8:40 am

    Hence my question here

    http://wattsupwiththat.com/2014/01/25/sunny-spots-along-the-parana-river/#comment-1550138

    to which nobody here even has an answer …..

    Your question was:

    I just need to know in what town the flowmeter is situated and which way does the river flow? Is it north to south or south to north?

    Me, I paid no attention to your question because it revealed that you hadn’t bothered to read the paper I linked to. I suspect others may have done the same. If you can’t be bothered to read, I can’t be bothered to answer. The very first page of that paper says:

    Here we analyze the stream flow data measured at a gauging station located in the city of Corrientes, 900 km north of the outlet of the Paraná.

    Here are the clues. Corrientes. North to south. Now you are claiming that nobody here has an answer? Well, somebody here doesn’t … the rest of us read the study.

    The best way to evaluate the amount of energy coming in, is to look at maximum temperatures.
    Now why on earth am I only the person to have done just that?

    Oh, I see. You’re the first person to ever think of looking at the trends in maximum temperatures. I mean, the fact that a google search turns up a million pages for trend “maximum temperatures” with the first one being entitled “Variability and trends in daily minimum and maximum temperatures” supports your claim of primacy, right?

    “Clueless” comes to my mind here.

    Indeed it does, HenryP … indeed it does …

    w.

  120. Nancy C says:
    January 27, 2014 at 10:49 am

    To me, Willis’ insistence that you can’t run statistical analysis on smoothed data is a bit like insisting that radio transmission can’t possibly work. If I record all the radio noise coming into my house, when I look at it in the raw it won’t look like it correlates to anything at all, it’s just noise. But if I smooth it through a filter and run it through a speaker and suddenly discover a high correlation between it and what someone across town is saying into a microphone, should that be dismissed as a spurious correlation?

    Thanks, Nancy, you raise an interesting point. Of course, you are right that there are various situations where, through trial and error, we have discovered how to successfully filter out unwanted signals to reveal the underlying relationship of interest.

    So to use your example, we tune a radio receiver to filter out unwanted signals. Please note however, that this is NOT a smoother. It is a filter. The difference, at least in how I use the words, is that a smoother (such as the 11-year running mean that they used on the Parana data above) merely redistributes the energy present in the signal. A filter, on the other hand, actually removes energy from the signal.

    So in the radio example, yes, to extract the audio signal we need to filter out unwanted radio frequencies.

    What we don’t do, though, is use a smoother to recover the audio signal from a radio broadcast. A smoother such as they have used on the Paraná River data would just make a jumbled mess out of the radio signal.

    I hope this clarifies my statement. A well-designed filter is a piece of art. A poorly designed smoother is an abomination. However, even the best of smoothers can introduce purely spurious apparent correlations, and therein lies the problem. From a comment above:

    Loynes, R. M. 2005. Slutzky–Yule Effect. Encyclopedia of Biostatistics.

    Abstract
    Smoothing a time series by forming a moving average is a commonly employed approach. In this article, some of the problems that arise are discussed, in particular, the introduction of correlations even when the observations in the original series were independent.

    Do you see the part about how it introduces correlations even though the series are totally independent? That’s the issue. Smoothing generates bogus correlations out of nothingness.

    I discuss this and give some examples in my post called “Data Smoothing and Spurious Correlation”.

    Best regards,

    w.

  121. @Willis
    sorry I missed that
    but I knew my estimate

    http://wattsupwiththat.com/2014/01/25/sunny-spots-along-the-parana-river/#comment-1550223

    would not be too far out
    especially that it was running exactly opposite the direction of the Nile…
    Perhaps I am not so clueless?

    Many things appear on the internet if you search
    but show me anyone that shows me the correct trend in dropping maxima?

    The world is cooling my friend, from the top [90] degrees latitudes down

    causing more rain [30] latitudes, on average.
    If anything just try to understand this about the weather

    just live with it

    or (rather) move

    http://blogs.24.com/henryp/2013/04/29/the-climate-is-changing/

    God bless you all.

  122. RichardLH says:
    January 27, 2014 at 3:32 pm

    Nancy C says:
    January 27, 2014 at 10:49 am

    I agree. I cannot understand people’s reluctance to use simple low pass filters on the data to separate out Weather from Climate. (see above).

    That is what everybody SAYS they want to do.

    But when you point out that standard audio/radio/power/just about every other branch of science uses them all the time you are met with such stubborn resistance.

    Richard, see my answer above. There is a big difference between a properly designed filter, and a smoother such as they have used in the Parana River study.

    w.

  123. my previous comment did not come out right
    let me try again

    @Willis
    sorry I missed that. no need to be nasty.
    but I knew my estimate

    http://wattsupwiththat.com/2014/01/25/sunny-spots-along-the-parana-river/#comment-1550223

    would not be too far out
    especially that it was running exactly opposite the direction of the Nile…
    Perhaps I am not so clueless?

    Many things appear on the internet if you search
    but show me anyone that shows me the correct trend in dropping maxima?

    The world is cooling my friend, from the top [90] degrees latitudes down

    causing more rain at less than [30] latitudes and less rain greater than [30] latitudes, on average.
    If anything just try to understand this about the weather

    just live with it

    or (rather) move

    http://blogs.24.com/henryp/2013/04/29/the-climate-is-changing/

    God bless you all.

  124. “””””…..Willis Eschenbach says:

    January 28, 2014 at 12:21 pm

    Nancy C says:
    January 27, 2014 at 10:49 am

    To me, Willis’ insistence that you can’t run statistical analysis on smoothed data is a bit like insisting that radio transmission can’t possibly work. If I record all the radio noise coming into my house, when I look at it in the raw it won’t look like it correlates to anything at all, it’s just noise. But if I smooth it through a filter and run it through a speaker and suddenly discover a high correlation between it and what someone across town is saying into a microphone, should that be dismissed as a spurious correlation?…..”””””

    Well at first glance, Nancy C. ‘s argument seems rather compelling. She has all this radio noise coming into her house, so she SMOOTHS it, and discovers it is all due to the chap across town talking into a microphone.

    So I tried her experiment, and I didn’t get anybody talking at all. All I got was some crummy Italian Opera broadcast from the Met, in NYC.

    Must be something wrong with my filter? So I re-tweaked my filter, and blow me down, if I didn’t pick up some caterwauling from the Grammy self-aggrandizement festival the other night.

    Apparently, what you end up with, which YOU interpret as SIGNAL, is totally dependent on how you designed your filter. My experiments seem to indicate that in addition to what totally seems like random noise, there are some other, perhaps a whole lot of, real signals coming into Nancy’s house, and she simply wiped out most of them because they didn’t really support her pre-conceived opinion about what was coming into her house.

    Likewise, if you filter a composite signal climate data set, you will turn it into a faux data set, which pre-emptively supports the conjecture that led to you examining this data in the first place. Other influences that aren’t close to your desired “signal” in frequency spectrum will be expunged, and their role in the reality will be suppressed.

    We seem to be thoroughly mixing up signal processing technologies, with statistical data manipulations.

    Real signals are here in real time, and then they are gone, never to be repeated. So what is there to do statistics on?

    Richard LH tells us that if we are looking for a climate signal that has 30 year periods for the changes being sought, we should run the data through a filter with a 15 year corner period.

    So Richard’s “Climate signal” evidently is band limited with a drop dead cutoff frequency of 1/30year. NO signal components faster than 1/30year frequency.

    Well his preferred signal recovery filter has a corner frequency (presumably at -3dB) of 1/15year; twice the bandwidth of his signal.

    Well yes, it will do a bang-up job of detecting virtually all of the energy in his desired signal, but it is also going to pick up all of the other signals and random noise in his detection bandwidth of 1/15year frequency range.

    Well if the signal to noise ratio of his 1/30year signal is very high, he will in fact get a pretty good replica of it. Old school Oscilloscope engineers (IR1) would recommend a scope bandwidth of ten times your signal bandwidth if you want to get good high fidelity time domain response on your screen, but if you are pushing the speed limits, then double the bandwidth is a damn fine second choice. If the scope bandwidth is equal to or less than your signal bandwidth, then you are going to get distortions, and particularly a lousy transient response; but if you are clever and careful, you can compensate for the deficiency, and make better conclusions about what the signal really is.

    But if you have a very low signal to noise ratio, or even a very high noise to signal ratio, such as with a Loran-C signal for example, excess bandwidth, like Richard LH advocates, is a real enemy.

    Some people try to get clever with a “brick wall” filter or Tchebychev, or Butterworth filters, with cutoffs matching their signal bandwidth. This is rather hazardous, as such filters produce time domain transient responses (overshoots) that add spurious information, right at the cutoff frequency, where your presumed signal is sitting, so you get a fake response. You also do not get the best filtered signal to noise ratio, that is available to you.

    In the case of the Loran-C signal for example, where the real time domain response of the transmission you are trying to detect is precisely known, it is often considered best to use a Gaussian response filter, which produces no time domain overshoots, and to set the -30 dB cutoff frequency of that filter to the Loran carrier frequency (100 kHz), rather than the -3 dB cutoff. Now Gaussian filters roll off very slowly in the stop band, so the -3 dB frequency of the optimal filter is well below 100 kHz; the signal loss is quite small, and the noise attenuation of the reduced bandwidth is a big advantage. Loran-C receivers also take advantage of the fact that they know exactly when the signals are transmitted, so they don’t even have the radio turned on until the expected time of arrival of the signal (it’s a periodic stream of pulses).
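
The difference in time domain behaviour between a sharp-cutoff filter and a Gaussian one can be sketched numerically. The example below is only an illustration under stated assumptions: it uses a truncated ideal low-pass (sinc) kernel as a stand-in for a “brick wall” filter rather than a Butterworth or Tchebychev design, and the kernel length, sigma and cutoff are arbitrary choices, not Loran-C values. It feeds a unit step through each filter and reports how far the output strays outside the range 0 to 1.

```python
import math

def normalize(kernel):
    """Scale the kernel so its weights sum to one (unit gain at DC)."""
    total = sum(kernel)
    return [k / total for k in kernel]

def gaussian_kernel(half_width, sigma):
    """Sampled Gaussian; every weight is positive."""
    return normalize([math.exp(-0.5 * (i / sigma) ** 2)
                      for i in range(-half_width, half_width + 1)])

def sinc_kernel(half_width, cutoff):
    """Truncated ideal low-pass; cutoff is in cycles per sample."""
    def sinc(i):
        if i == 0:
            return 2 * cutoff
        return math.sin(2 * math.pi * cutoff * i) / (math.pi * i)
    return normalize([sinc(i) for i in range(-half_width, half_width + 1)])

def step_response(kernel, n=400):
    """Filter a unit step placed mid-series, keeping full windows only."""
    step = [0.0] * (n // 2) + [1.0] * (n // 2)
    w = len(kernel)
    return [sum(k * s for k, s in zip(kernel, step[i:i + w]))
            for i in range(n - w + 1)]

# Assumed illustration parameters, not taken from any real receiver.
gauss_out = step_response(gaussian_kernel(half_width=50, sigma=10))
brick_out = step_response(sinc_kernel(half_width=50, cutoff=0.05))

print("Gaussian   min/max:", round(min(gauss_out), 3), round(max(gauss_out), 3))
print("Brick wall min/max:", round(min(brick_out), 3), round(max(brick_out), 3))
```

Because all the Gaussian weights are positive, its step response cannot leave the 0 to 1 range. The sinc kernel has negative lobes, so its output dips below 0 before the step and rises above 1 after it, which is the ringing described above.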

    It takes a long time to figure out where the signal arrival time is, so you have to search, and as you gather more information, you improve your search method.

    Well GPS does the same thing; once you find some sort of signal, you can set your clock to the correct (atomic) time, which makes it much easier to locate other signals.

    But Nancy C.’s approach will wipe out all the other things that make the Piranhas run up the river, besides sunspots, and the wrong conclusions will be made.

    As for trying to determine the average Temperature of the earth (if that is possible), what for ??

    The heating and cooling of the earth is a quite non-linear function of the LOCAL TEMPERATURES. Outgoing total radiant emittance varies with the 4th power of the Temperature; not linearly with the Temperature. And the peak spectral radiant emittance varies as the 5th power of the Temperature (on a wavelength scaled plot).
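
A small numeric sketch shows why the fourth-power law makes an averaged temperature a poor proxy for radiated energy. It assumes ideal black-body emission (emissivity of one) and uses five made-up local temperatures; the values are hypothetical and are not measurements.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def emittance(temp_k):
    """Black-body total radiant emittance in W/m^2 at temp_k kelvin."""
    return SIGMA * temp_k ** 4

# Hypothetical local temperatures in kelvin (illustrative only).
local_temps = [220.0, 260.0, 288.0, 300.0, 320.0]

mean_temp = sum(local_temps) / len(local_temps)
emittance_of_mean = emittance(mean_temp)
mean_of_emittance = sum(emittance(t) for t in local_temps) / len(local_temps)

print(f"mean temperature:        {mean_temp:.1f} K")
print(f"emittance at mean temp:  {emittance_of_mean:.1f} W/m^2")
print(f"mean of local emittance: {mean_of_emittance:.1f} W/m^2")
```

Since T^4 is a convex function, the mean of the local emittances is always at least as large as the emittance computed from the mean temperature, and the gap grows as the local temperatures spread further apart.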

    So what the hell good is an average Temperature. This fallacy is a result of the often stated axiom, that climate is the average of the weather. It is NOT; climate is the INTEGRAL of the weather.

    Mother Gaia, sees everything in real time, and if a dark cloud appears in the sky over a patch of ocean, and casts a dark shadow on the surface, Mother Gaia, immediately starts deducting the Joules that no longer are arriving at that spot, and subtracts them from the ocean’s energy assets ledger. An hour later when that cloud has gone, due to a tropical deluge on a nearby island, that the cloud was actually over, the sun will resume supplying the Joules that were originally contracted for, by that ocean spot.

    GISS will still be asleep, and never even see the cloud; but absolutely nothing escapes Mother Gaia’s attention, and she always gets her energy budget accounting correct.

    The average(ing) Joe never gets it right.

    By the way, recovering a SIGNAL with the highest fidelity and accuracy, requires recovering ALL of the spectral components of that signal, so simply brute force SMOOTHING as Nancy C puts it, will thoroughly distort the true signal.

    Now when I was actually doing Oscilloscope design, I used to (jokingly) quip, that what the world really needed, was slower and slower Oscilloscopes.

    You put a really high bandwidth oscilloscope on your nice smooth flowing signal, and the damn thing adds all kinds of glitches, and rings and dings to completely screw up your nice serene signal; we should do away with fast oscilloscopes, because they just mislead us.

  125. george e. smith says:
    January 28, 2014 at 3:08 pm

    “You put a really high bandwidth oscilloscope on your nice smooth flowing signal, and the damn thing adds all kinds of glitches, and rings and dings to completely screw up your nice serene signal; we should do away with fast oscilloscopes, because they just mislead us.”

    More like a single trace storage scope in reality.

    Simple answer, put a 15 year low pass filter on the data (to sort out the climate signal from the rest) and away you go.

  126. I can never figure out why everybody is so happy to stop filtering at Month and/or Year.

    Means (a form of filtering) in their single form are so crap mathematically (like a very crap audio filter) and they are much better replaced by a near Gaussian alternative, the Cascaded Triple Running Mean, with parameters set to be a very good ‘stop band/pass band’ discriminator of Day-Month-Year v Climate signals.
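
For readers who have not met it, a cascaded triple running mean is simply three running means applied one after another. The sketch below is a minimal illustration with assumed parameters: the stage widths of 15, 12 and 10 samples and the test frequency are placeholders chosen here, not values given in this comment. It prints the combined impulse response, which is bell shaped rather than flat, and compares how much of an unwanted fast oscillation leaks through a single mean versus the cascade.

```python
import math

def running_mean(x, width):
    """Running mean over full windows only (output is shorter than input)."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]

def cascaded_triple_running_mean(x, widths):
    """Apply three running means in sequence."""
    for width in widths:
        x = running_mean(x, width)
    return x

def gain(width, freq):
    """Amplitude gain of one running mean at freq (cycles per sample)."""
    if freq == 0:
        return 1.0
    return math.sin(math.pi * freq * width) / (width * math.sin(math.pi * freq))

# Assumed stage widths in samples; placeholders for illustration.
widths = (15, 12, 10)

# Impulse response: feed a single spike through the cascade.
impulse = [0.0] * 50 + [1.0] + [0.0] * 50
response = cascaded_triple_running_mean(impulse, widths)
print("impulse response:", [round(v, 4) for v in response if v > 0])

# Leakage of a fast oscillation (assumed frequency: 0.09 cycles per sample).
freq = 0.09
single_gain = abs(gain(widths[0], freq))
cascade_gain = abs(math.prod(gain(w, freq) for w in widths))
print(f"single mean gain: {single_gain:.4f}, cascade gain: {cascade_gain:.4f}")
```

A single running mean has large side lobes in its frequency response, so fast wiggles leak through, sometimes with inverted sign. Cascading three means of different widths multiplies the three responses together, which knocks the side lobes down and gives a kernel that is close to Gaussian in shape.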

  127. george e. smith says:
    January 28, 2014 at 3:08 pm

    “So Richard’s “Climate signal” evidently is band limited with a drop dead cutoff frequency of 1/30year. NO signal components faster than 1/30year frequency.”

    No this is a filter of exactly the same form as a broadband filter on an Internet connection.

    Telephone one way. Broadband the other. As flat a filter as you can possibly get. A binary chop of the data into two bins.

    Day-Month-Year-Decadal = stop band.
    Climate = pass band.

    That’s how Gaussian broadband low pass filters work.

  128. george e. smith says:
    January 28, 2014 at 3:08 pm

    By the way, I did say a 15year cut-off point. Nicely placed between Decadal and Multi-decadal also.

    As Climate is normally defined as >30 years seems OK to me.

  129. HenryP says:
    January 29, 2014 at 10:19 am

    “Henry says
    15 years is clearly stupid.”

    If I was cycle hunting I would agree.

    However that is not what is happening here. This is pure observation. A summary of the data and what it says happened.

    Nothing more than extending the Day – Month – Year pattern to longer time scales.

    What does the DATA say?

  130. HenryP says:
    January 29, 2014 at 11:37 am

    “@RichardLH
    my collected data”

    So if you were to track those through time you would be able to express the changes in the sampled values.

    This, depending on your sampling methodology, may represent how the overall system is changing.

    Doing it with as large a selection of data as possible, from as wide an area as possible, as often as possible, is likely to get you a more accurate result though. Advantage in larger number of samples. Better statistics.

    Just as the satellites are likely to be more accurate than a land only (mostly) sampled sub-set.

  131. #RichardLH

    My feeling is exactly opposite
    here are 4 major datasets

    http://www.woodfortrees.org/plot/hadcrut4gl/from:1987/to:2014/plot/hadcrut4gl/from:2002/to:2014/trend/plot/hadcrut3gl/from:1987/to:2014/plot/hadcrut3gl/from:2002/to:2014/trend/plot/rss/from:1987/to:2014/plot/rss/from:2002/to:2014/trend/plot/hadsst2gl/from:1987/to:2014/plot/hadsst2gl/from:2002/to:2014/trend/plot/hadcrut4gl/from:1987/to:2002/trend/plot/hadcrut3gl/from:1987/to:2002/trend/plot/hadsst2gl/from:1987/to:2002/trend/plot/rss/from:1987/to:2002/trend

    showing it is cooling
    Add to this my own three data sets showing it is cooling from 2000
    UAH is going the opposite direction
    so my question is
    how is sat. data collected and how do they calibrate?
    how do you know that the absolute zero in the solar system is constant?
    maybe it just shifts as time goes by?

  132. HenryP says:
    January 29, 2014 at 12:16 pm

    “My feeling is exactly opposite”

    My maths (using OLS alignment to get the data sets to a common position) says different.

  133. HenryP says:
    January 30, 2014 at 12:35 pm

    “@richard
    To find the correct cycle we are in you must do your own research. Otherwise you will never get it right”

    I have done and part of the results are shown above.

    Also applied cold, hard logic to what I see. And well known and understood methodology with solid engineering support to implement it.

  134. @richard
    Your first two graphs choosing a linear fit 1979-2014 is not a good choice,
    Surely you must know that the temperature is reacting-/due to a non-linear situation, i.e.

    http://blogs.24.com/henryp/2012/10/02/best-sine-wave-fit-for-the-drop-in-global-maximum-temperatures/

    If you look at the second graph in my blog post there, you will see why it makes more sense to start looking from say the beginning of the millennium? How else would anyone try to explain to me what I found here in Alaska? (9 out of 10 weather stations showing cooling. The tenth one possibly has an erroneous entry for 2000).

    It is cooling my friend, from the top [latitudes] down.

    As I said I have my doubts about UAH and the satellite data in general, mostly to do with the zero point and actual (non-zero) point calibration. Please enlighten me if you can.

    As to your third graph, possibly we are oscillating within various cycles, but I have my doubts about the (global) data sets before 1950. Before 1950 thermometers were not re-calibrated and there was no automatic temperature recording. I think errors of about 0.5 degrees C are not only likely but very possible. When people went on leave, the job simply did not get done…..

  135. HenryP says:
    January 31, 2014 at 5:54 am

    “@richard
    Your first two graphs choosing a linear fit 1979-2014 is not a good choice,
    Surely you must know that the temperature is reacting-/due to a non-linear situation, i.e.
    http://blogs.24.com/henryp/2012/10/02/best-sine-wave-fit-for-the-drop-in-global-maximum-temperatures/

    Me? I never use Linear Trends for anything. ‘Linear Trend’ = ‘Tangent to the curve’ = ‘Flat Earth’.

    And fitting curves to data given the amount of noise in the signal is pointless. You might as well argue that an FT is fit for purpose in the same data :-)

  136. HenryP says:
    January 31, 2014 at 7:42 am

    “@richard
    You are confusing me.
    Are you with me or against me on my finding of
    “natural climate change” in the years ahead”

    I strongly believe that natural factors have been underestimated in the data to date.

    This does not mean that I can make predictions on what WILL happen, only what MAY.

  137. just a final comment
    The minimum flowrate of the Parana river was in 1953 or 1954, average.
    The maximum flowrate appears to be around 1990, average.
    There is no data before 1905, but it seems the curve came down from a maximum flow rate at around 1895.

    Now look here:
    There are good records of the flooding of the Nile, for example here:

    http://www.cyclesresearchinstitute.org/cycles-astronomy/arnold_theory_order.pdf

    to quote from the above paper:
    “A Weather Cycle as observed in the Nile Flood cycle, Max rain followed by Min rain, appears discernible with maximums at 1750, 1860, 1950 and minimums at 1670, 1800, 1900 and a minimum at 1990 predicted.”

    the important point to note here is that the Nile flooding and Parana flow rate are out of sync,
    but the turning points correlate quite convincingly

    The Nile collects water 0-30 degrees south to north
    The Parana collects water -30 onwards north to south

    that is why I figured out (correctly) that the Parana river flows opposite the direction of the Nile

    why this happens is explained by warming and cooling periods

    so I could add the results of these measurements as another confirmation

    http://blogs.24.com/henryp/2013/04/29/the-climate-is-changing/

Comments are closed.