Guest Post by Willis Eschenbach
I came across a curious graph and claim today in a peer-reviewed scientific paper. Here’s the graph relating sunspots and the change in sea level:
And here is the claim about the graph:
Sea level change and solar activity
A stronger effect related to solar cycles is seen in Fig. 2, where the yearly averaged sunspot numbers are plotted together with the yearly change in coastal sea level (Holgate, 2007). The sea level rates are calculated from nine distributed tidal gauges with long records, which were compared with a larger set of data from 177 stations available in the last part of the century. In most of the century the sea level varied in phase with the solar activity, with the Sun leading the ocean, but in the beginning of the century they were in opposite phases, and during SC17 and 19 the sea level increased before the solar activity.
Let me see if I have this straight. At the start of the record, sunspots and sea level moved in opposite directions. Then for most of the time they were in phase. In both those cases, sunspots were leading sea level, suggesting the possibility that sunspots might affect sea level … except in opposite directions at different times. And in addition, in about 20% of the data, the sea level moved first, followed by the sunspots, suggesting the possibility that at times, the sea level might affect the number of sunspots …
Now, when I see a claim like that, after I get done laughing, I look around for some numerical measure of how similar the two series actually are. This is usually the “R2” (R squared) value, which runs from zero (no relationship) to one (they always move proportionately). Accompanying this R2 measure there is usually a “p-value”. The p-value measures how likely it is that we’re just seeing random variation. In other words, the p-value is the probability of getting a result at least this strong if only chance were at work. A p-value of 0.05, for example, means the odds are one in twenty that a result this strong would appear by chance.
So … what did the author of the paper put forward as the R2 and p-value for this relationship?
Sad to relate, that part of the analysis seems to have slipped his mind. He doesn’t give us any guess as to how correlated the two series are, or whether we’re just looking at a random relationship.
So I thought, well, I’ll just get his data and measure the relationship myself. However, despite the journal’s policy requiring public archiving of the data necessary for replication, as is too common these days there was no public data, no code, and not even a Supplementary Online Information.
However, years of messing around with recalcitrant climate scientists has shown me that digitizing data is both fast and easy, so I simply digitized the graph of the data so I could analyze it. It’s quite accurate when done carefully.
And what did I find? Well, the R2 between sunspots and sea level is a mere 0.13, very little relationship. And even worse, the p-value of the relationship is 0.08 … sorry, no cigar. There is no statistically significant relationship between the two. In part this is because both datasets are so highly auto-correlated (~0.8 for both), and in part it’s because … well, it’s because as near as we can tell, sunspots [or whatever sunspots are a proxy for] don’t affect the sea level.
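For anyone who wants to replicate this kind of check, here is a minimal sketch in Python of how the R2, the p-value, and the lag-1 autocorrelation are computed. The two series below are synthetic stand-ins (the actual digitized data were not archived with this post), so the printed numbers will not match the ones above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic stand-ins for the two annual series (the real data were never archived)
years = np.arange(1900, 2000)
sunspots = 60 + 50 * np.sin(2 * np.pi * years / 11) + rng.normal(0, 10, years.size)
sea_rate = rng.normal(2.0, 1.0, years.size)  # mm/yr, unrelated by construction

# Ordinary least-squares fit gives the slope, intercept, r, and p-value
res = stats.linregress(sunspots, sea_rate)
print(f"R^2 = {res.rvalue**2:.3f}, p = {res.pvalue:.3f}")

# Lag-1 autocorrelation, which inflates apparent significance if ignored
def lag1_autocorr(x):
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

print(f"lag-1 autocorrelation of sunspots: {lag1_autocorr(sunspots):.2f}")
```

Note that a naive p-value like the one printed here assumes independent data points; with lag-1 autocorrelations near 0.8, the effective sample size is much smaller and the true p-value correspondingly larger.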
My conclusions from this, in no particular order, are:
• If this is the author’s “stronger effect related to solar cycles”, I’m not gonna worry about his weaker effect.
• This is not science in any sense of the word. There is no data. There is no code. There is no mathematical analysis of any kind, just bald assertions of a “stronger” relationship.
• Seems to me the idea that sunspots rule sea level would be pretty much scuttled by sunspot cycles 17 and 19 where the sea level moves first and sunspots follow … as well as by the phase reversal in the early data. At a minimum, you’d have to explain those large anomalies to make the case for a relationship. However, the author makes no effort to do so.
• The reviewers, as is far too often the case these days, were asleep at the switch. This study needs serious revision and buttressing to meet even the most minimal scientific standards.
• The editor bears responsibility as well, because the study is not replicable without the data as used, and the editor has not required the author to archive the data.
So … why am I bothering with a case of pseudo-science that is so easy to refute?
Because it is one of the papers in the Special Issue of the Copernicus journal, Pattern Recognition in Physics … and by no means the worst of the lot. There has been much disturbance in the farce lately regarding the journal being shut down, with many people saying that it was closed for political reasons. And perhaps that is the case.
However, if I ran Copernicus, I would have shut the journal down myself, but not for political reasons. I’d have closed it as soon as possible, for both scientific and business reasons.
I’d have shut it for scientific reasons because, as we see in this example, peer review was absent, the editorial actions were laughable, the authors reviewed each other’s papers, and the result was lots of handwaving and very little science.
And I’d have shut it for business reasons because Copernicus, as a publisher of scientific journals, cannot afford to become known as a place where reviewers don’t review and editors don’t edit. It would make them the laughing stock of the journal world, and being the butt of that kind of joke is something that no journal publisher can survive.
To me, it’s a huge tragedy, for two reasons. One is that I and other skeptical researchers get tarred with the same brush. The media commentary never says “a bunch of fringe pseudo-scientists” brought the journal down. No, it’s “climate skeptics” who get the blame, with no distinctions made despite the fact that we’ve falsified some of the claims of the Special Issue authors here on WUWT.
The other reason it’s a tragedy is that they were offered an unparalleled opportunity, the control of a special issue of a reputable journal. I would give much to have the chance that they had. And they simply threw that away with nepotistic reviewing, inept editorship, wildly overblown claims, and a wholesale lack of science.
It’s a tragedy because you can be sure that if I, or many other skeptical researchers, got the chance to shape such a special issue, we wouldn’t give the publisher any reason to be unhappy with the quality of the peer-review, the strength of the editorship, or the scientific quality of the papers. The Copernicus folks might not like the conclusions, but they would be well researched, cited, and supported, with all data and code made public.
Ah, well … sic transit gloria monday, it’s already tuesday, and the struggle continues …
w.
PS—Based on … well, I’m not exactly sure what he’s basing it on, but the author says in the abstract:
The recent global warming may be interpreted as a rising branch of a millennium cycle, identified in ice cores and sediments and also recorded in history. This cycle peaks in the second half of this century, and then a 500 yr cooling trend will start.
Glad that’s settled. I was concerned about the next half millennium … you see what I mean about the absence of science in the Special Edition.
PPS—The usual request. I can defend my own words. I can’t defend your interpretation of my words. If you disagree with something I or anyone has written, please quote the exact words that you object to, and then tell us your objections. It prevents a host of misunderstandings, and it makes it clear just what you think is wrong, and why.

Mike: “I ask because I am curious what the correlation would have looked like if one excluded the single out-of-phase period ”
I don’t think it’s necessary to question the data or to think it’s just one odd-ball cycle. There are other factors at play in climate (as is detailed in the paper !!).
It may be possible to see something from a cross-correlation of the data shown, but just annual points is not going to be very informative.
The point is that this kind of slipping in and out of phase is _exactly_ what you get when you have two close frequencies (like 9.1 and 11, for example). As Ian says, all the low correlation value tells us is that saying SSN is the primary and sole cause of MSL or temperature change is not correct.
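To illustrate the point about two close frequencies, here is a small sketch (synthetic sine waves, not the actual data): two clean periodic signals with periods of 11 and 9.1 years drift in and out of phase over a roughly 53-year beat, so the overall Pearson correlation across a century comes out near zero even though both series are perfectly periodic:

```python
import numpy as np

t = np.arange(0, 100, 1.0)          # 100 "years", annual sampling
a = np.sin(2 * np.pi * t / 11.0)    # roughly the solar-cycle period
b = np.sin(2 * np.pi * t / 9.1)     # a nearby period

# Beat period: the two drift a full cycle apart every 1/|1/9.1 - 1/11| years
beat = 1.0 / abs(1 / 9.1 - 1 / 11.0)
print(f"beat period ~ {beat:.0f} years")

# Over the whole record the correlation is low even though both
# series are perfectly periodic
r = np.corrcoef(a, b)[0, 1]
print(f"overall r = {r:.2f}")
```

So a low overall r does not by itself rule out a periodic linkage; it is exactly what mixing of nearby periods would produce.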
richardscourtney,
R^2 tests the predictive value of a model. Sunspots are not a model. They are ‘maybe’ a single factor in a model containing many factors. Correlation tells us if something is a likely factor and how important a factor it probably is. Sunspots are highly correlated and are therefore probably an important factor, but not the complete story for predictive purposes.
Do you disagree? Here are some examples.
Smoking is correlated with lung cancer. Smoking is not predictive though. Using smoking as the only factor for lung cancer would give us a poor R^2 value. Smoking is highly correlated with lung cancer though. This tells us that while not the only factor, it’s an important one.
The temperature in Edmonton is correlated with the rotation of the Earth (whether the Sun is shining or not); however, rotation is not the only factor. Tilt is also important. If we build a model of temperatures that uses ‘only’ rotation, we will have a poor R^2 value. It doesn’t do a very good job of predicting absolute temperatures in Edmonton, as daytime/nighttime temperatures only vary a small amount compared to seasonal temperature changes. However, it has really high correlation. That tells us that Earth’s rotation is probably an important factor, just not the only factor, and that we need to take other things into account.
I’m not a ‘sunspot guy’ and have no position on it, but when I see sunspot numbers correlating highly with something (correlation, not R^2), then I acknowledge that there is almost definitely something there.
Tom: “Did you mean the no data should ever be “corrected” (whatever that correction means) or are you referring to a specific reason that the Holgate data should not be corrected for the effect of Pinatubo in the early ’90s?”
I mean there is far too much speculative “bias correction” going on. If you make an uncertain “correction” to the data you ADD to the uncertainty, you don’t reduce it. In the case of volcanism we do not know accurately how much radiative change is produced, and more importantly we don’t know the climate response to that change.
Much of the time it would be more appropriate to recognise that there are uncertainties and live with them. The current state of play is that we spend as much effort analysing the results of the corrections as we do analysing the data.
Most of these data sets seem to be controlled by groups with a “message” who are less than objective about how they alter the data.
What currently gets called “global mean sea level” is some phantom hovering slightly above the waves and getting more so each year.
It’s getting damn hard to find any true climate data to analyse.
All these people saying the correlation is visibly obvious – the first thing that stood out to me when looking at the graphs was the LACK of correlation. In some spots, the peaks coincide, in others they’re opposite. To me, the lack of correlation is quite clear. Are we looking at different graphs?
Willis is a little too smug for my liking. If it helps his ego.
What does WUWT fear? Turned into a mob.
Mike Rossander says:
January 22, 2014 at 1:50 pm
………….
I don’t know much about the SL measurements, and my view is that they are most likely within the error margin and hence, taken on an annual basis, irrelevant.
However cycle 17 is a particularly important one, it shows what happens to the Earth’s magnetic field when the solar intra-cycle oscillations are close to the Earth’s orbital period.
http://www.vukcevic.talktalk.net/SC17.htm
TonyG says:
January 22, 2014 at 2:45 pm
All these people saying the correlation is visibly obvious – the first thing that stood out to me when looking at the graphs was the LACK of correlation. In some spots, the peaks coincide, in others they’re opposite. To me, the lack of correlation is quite clear. Are we looking at different graphs?
######################
ah ya,
worse than that, they used 9 tidal gauges. Which 9? What happens if you pick a different 9 than the original author, Holgate?
worse than that, the author might have stolen the idea, including the bits about GCRs
http://climateaudit.org/2007/02/11/holgate-on-sea-level
Ian Schumacher:
At January 22, 2014 at 2:40 pm you ask me
I strongly disagree.
The coefficient of determination (r^2) does NOT test “the predictive value of a model” except within the limits of a model, which here is a correlation between two variables.
When two data sets are correlated then the r^2 indicates the proportion of the variance of one variable that is predictable from the other variable. Hence, it is a measure of the certainty of a prediction of one variable from the other as indicated by their correlation.
For example, if r^2 is 0.850, then 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
Putting that in plain language, if the r^2 is low then confidence that there is a useful correlation is low.
In his article above, Willis says
In other words, almost all the variation between the two parameters is NOT explained by a correlation between them.
Of course, that does not disprove the possibility that they are each related to a third parameter, but the paper does not suggest any such third parameter. All one can say is that the correlation between the two considered parameters is so poor that the correlation is NOT a model which enables one of the two parameters to be usefully predicted from the other.
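A quick numeric check of the statement above, using made-up data: for a simple linear fit, r^2 equals the fraction of the variance of y that the regression explains:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(size=500)   # linear signal plus noise

# Pearson correlation between the two series
r = np.corrcoef(x, y)[0, 1]

# Fit y on x and compare r^2 with the explained-variance ratio
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)
explained = 1 - resid.var() / y.var()

print(f"r^2 = {r**2:.3f}, explained variance = {explained:.3f}")
```

The two printed numbers agree to rounding, which is the identity Richard describes: with r^2 = 0.13, 87% of the variance in the sea level rate is left unexplained by sunspots.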
Richard
@Steven Mosher
January 22, 2014 at 3:02 pm
This was Steve McIntyres comment:
“…Doesn’t it look like there’s something like an 11-year cycle in this? Remind you of anything? I know that it’s a bit of a mug’s game trying to identify solar cycles, but here’s a plot of sun spot numbers in the same period. The maxima and minima of the solar cycles seem to match the fluctuations in sea level rise rather uncannily. While the resemblance is impressionistic (I don’t have a digital version of Holgate’s series), offhand, I can’t think of any two climate series with better decadal matching…”
http://climateaudit.org/2007/02/11/holgate-on-sea-level/
It’s late, past my bed time, I’ve forgotten but I did a graph on sea level ‘correlation’ while ago, so here it is:
http://www.vukcevic.talktalk.net/SeaLevel.htm
yes Manfred, trust nobody’s eyes. That’s why the experts on the thread call for the data to do the calculation. Note too that nobody dug down to check the data.
Now, if I selected 9 stations out of 177 and proved to you that UHI wasn’t real, wouldn’t the FIRST thing you’d do be to ask for the data?
If I told McIntyre that 9 tree rings out of 177 gave me a flat MWP, what do you think he would do?
It’s simple. Before you let your eyes deceive you: get the data. Check the data. Then get the code. Check the code. Otherwise even the best eyes are led astray.
To DirkH “What does WUWT fear? Turned into a mob.” And the earlier commenter who made the analogy to circling wagons and shooting arrows outwards not inwards. For me this should not be a matter of taking one ‘side’ or the other – or of finding evidence to support the paradigms of one’s own ‘side’. Scepticism is for individuals – not ‘sides’ or teams (or any grouping that could refer to themselves as ‘we’ or ‘us’). So don’t expect me to join up with some great ’cause’.
Above all, I want to see good science – of the Feynman “bending over backwards to disprove one’s own theory” type. Not the grasping at any old straw to support one’s own theory type of junk science that has become all too prevalent (in lots of different subjects). The sort of junk science that is trotted out in support of the CAGW agenda is a pet hate of mine – which is why I lurk at blogs like this one.
Unfortunately, the papers Willis and others are criticising fall far short of Feynman’s ideals – to put it mildly – Willis has been too kind in my opinion.
For those interested: a really quick power spectrum from cross-correl, Jevrejeva d/dt(MSL) and daily sunspot area. No bells whistles or other decoration. 😉
http://i40.tinypic.com/206j7th.png
By far the largest peak is at 5.405 years. The ubiquitous 9.03 is present but small; the 20.27 will be rather uncertain.
It’s only an annual-resolution CC, but there’s a peak correlation of 0.335 at one year lag. Guessing by form, the actual peak is near 0.5 years.
5.4 = 10.8/2 looks like the main solar linkage.
Interestingly, this is damn near the period I extracted from Arctic sea ice about a year ago, which got Tamino in such a panic that he spent a week trying to diss my efforts before getting in a sulk and slamming the door shut.
http://climategrog.wordpress.com/2013/03/11/open-mind-or-cowardly-bigot/
It is also quite close to the 5.8 semi cosine that I found as a repetitive pattern this year when evaluating decadal variation in Arctic Sea Ice.
http://climategrog.wordpress.com/2013/09/16/on-identifying-inter-decadal-variation-in-nh-sea-ice/
@Steve Mosher,
eyes may deceive you, but so may an r-value, or similarly Pearson’s correlation coefficient, if you trust them without considering their limitations.
These are just widely used measures of correlation, but not the only ones.
A single phase shift in one data set can turn an r value from one to zero. And there may have been phase shifts in the above data set. Or if you have two identical curves of short pulses, a small jitter may bring the r value from one to zero. And there may have been dating issues working that way.
For such curves with multiple ups and downs (like the ones above, or Bond et al. 2001, etc.), I think a better measure of correlation would be how often ups and downs in one curve occur in the other as well, plus comparing the similarity of the shapes of each 11-year cycle, which is essentially trying to quantify what your eyes tell you.
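A minimal sketch of that first point, using an artificial 11-year sine wave rather than real data: between two otherwise identical curves, a quarter-cycle shift drives Pearson’s r to zero, and a half-cycle shift drives it to minus one:

```python
import numpy as np

t = np.arange(0, 110, 1.0)   # ten full 11-year cycles, annual sampling
period = 11.0
a = np.sin(2 * np.pi * t / period)

# Identical curve, progressively phase-shifted
for shift, label in [(0.0, "none"),
                     (period / 4, "quarter cycle"),
                     (period / 2, "half cycle")]:
    b = np.sin(2 * np.pi * (t - shift) / period)
    r = np.corrcoef(a, b)[0, 1]
    print(f"shift = {label:>13}: r = {r:+.2f}")
```

So a timing error of a couple of years between two genuinely related 11-year signals is enough to wipe out the linear correlation entirely.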
I did a very interesting correlation in regard to PRP.
Why was Roger chosen as an “editor” and “reviewer” [with no relevant qualifications] and what do all the authors of the “Special Edition” have in common? It’s quite special…
They are all favorite discussion topics at Tallbloke’s Talkshop (ordered by word frequency);
1. Wilson (2,380)
2. Scafetta (1,180)
3. Salvador (854)
4. Jelbring (812)
5. Morner (222)
6. Charvatova (213)
7. Solheim (59)
Oops. Too late at night to be doing this sort of crap. A slip-up in collating the data missed out several years and screwed up the CC calcs.
Here’s a corrected SPD. Peak at 10.4 as expected; the max CC, at 0.68 years, is relatively low at about 0.18.
Estimated significance at 0.138 with 127 pts, but it’s way past bedtime, so take it with a pinch of salt.
http://i44.tinypic.com/sffddu.png
Hey Pops, you should have done a control for significance against red noise. 😉
https://www.google.com/search?q=Solheim+site:tallbloke.wordpress.com#q=Red+noise+site%3Atallbloke.wordpress.com (118)
Even I score 135 and I’m banned !!
You do have a point though, the review process does look rather incestuous.
Here’s the cross-correlation function before spectral analysis. The 79-year period corresponds to the phase crisis in the 1920s.
http://tinypic.com/view.php?pic=2hojthe&s=5#.UuB3_aJw1uA
Now a quick calculation of what would mix with 10.45 to cause a phase lag of one year in 79 …..
9.23 years.
So that’s why there’s a low corr. coeff. Like I’ve been saying for some time, it seems you can’t refute the presence of a periodicity just by trivial regression analysis.
This was a quick hack, so that 9.2 could be N. Scafetta’s 9.1, my 9.05, or maybe 18.6/2 = 9.3.
In view of other evidence I suspect it’s 9.05
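For anyone checking the arithmetic, the 9.23 figure follows from the standard beat relation between two nearby periods (the 10.45-year and 79-year values are the commenter’s, taken at face value):

```python
# Beat relation: two periods T1 and T2 drift one full cycle apart every
# T_beat = 1 / |1/T1 - 1/T2| years.  Solve for the shorter companion
# period T2 that beats against T1 = 10.45 yr over T_beat = 79 yr.
T1, T_beat = 10.45, 79.0
T2 = 1.0 / (1.0 / T1 + 1.0 / T_beat)
print(f"T2 = {T2:.2f} years")   # 9.23, matching the comment
```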
Willis,
I’m a fan. Your scientific and personal essays have been educational and lightened my days.
Unfortunately, your latest work was the proverbial last straw on a load built up from many, many essays from many sources. But . . .
. . . I must, at long last, say something about an endemic statistical solecism. (I apologize for wasting band width if someone beat me to it — I didn’t read all the comments.)
Hypothesis testing is a concept most undergraduates and a surprising fraction of Ph.D.s in fields that rely on non-experimental data find . . . perplexing. I have no idea how many times, in thirty some years, I said something like the following to one, a few, or a room full of (usually) undergraduate students (pedantry alert) :
The null hypothesis that a human’s body weight and donut consumption (say) are unrelated (null: r = 0, alternate: r ≠ 0) might be tested by comparing the:
a) calculated and critical values of a correlation coefficient, r, given the sample size and desired confidence level (0.95 by convention in fields that depend on non-experimental data) or
b) maximum acceptable risk to a decision maker of rejecting a true null hypothesis (alpha, chosen carefully (?) in advance, 0.05 by convention in fields . . . . The sum of the confidence level and alpha is 1.0. Always.) and the risk given the sample (the glorious p-value).
The null hypothesis can be rejected if the:
a) calculated correlation coefficient (r-hat, a caret over lower case r for estimated) is not smaller than the correctly chosen critical value (sometimes r*) or
b) p (a calculated value) is not larger than alpha (the maximum acceptable risk of rejecting a true null hypothesis, a carefully (?) chosen value).
Both methods yield the same result. Always.
In regression, of which Fourier analysis is an example, (capital) R-square reports the ratio of explained to total variation in an estimated equation, ignoring the effect (for example) of serial correlation in time series data and the inclusion of explanatory (independent) variables with obscure theoretical connections to the response (dependent) variable, both of which inflate R-square.
Iff (if and only if) there is exactly one explanatory variable, |r| = sqrt(R-square).
(pedantry off)
You report p = 0.08, so, if the author tested a hypothesis whose alpha was the conventional 0.05 with a test statistic (not R-square), then the author, you, and your readers are unable to reject the hypothesis that there is no relationship between the dependent variable and the set of independent variables.
I know of no hypothesis test based on R-square. One might test the null hypothesis that none of the estimated coefficients are “significant” with the F-test and related p-value, but reading a sentence that combines an R-square and a p-value makes my head . . . never mind.
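The claimed equivalence of the two methods is easy to check numerically with made-up data; the critical r comes from the critical t with n − 2 degrees of freedom:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 30
x = rng.normal(size=n)
y = 0.4 * x + rng.normal(size=n)   # weak linear signal plus noise

res = stats.linregress(x, y)
alpha = 0.05

# Method (b): compare the p-value with alpha
reject_by_p = res.pvalue <= alpha

# Method (a): compare |r-hat| with the critical r for this n and alpha,
# derived from the t distribution with n - 2 degrees of freedom via
# t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
r_crit = t_crit / np.sqrt(n - 2 + t_crit**2)
reject_by_r = abs(res.rvalue) >= r_crit

print(reject_by_p, reject_by_r)   # the two booleans always agree
```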
Thank you for the time, energy, diligence, . . . your work shows. I look forward to reading many more reports of your always interesting, always cogent, analyses and anything else you may post. All the best!
Willis, will you please post the digitized data you took? Thanks
The other reason it’s a tragedy is that they were offered an unparalleled opportunity, the control of a special issue of a reputable journal. I would give much to have the chance that they had. And they simply threw that away with nepotistic reviewing, inept editorship, wildly overblown claims, and a wholesale lack of science.
I think you have overreacted.
What is the alternative hypothesis to H0: R^2 = 0? How about H1: R^2 >= 0.10? For something that is influenced by many different agents, R^2 = 0.10 is too large to ignore, even though it is sometimes (in some behavioral research) customarily called “small”. (Most of the risk factors for high blood pressure, such as obstructive sleep apnea, have R^2 <= 0.10, but they are important in a large population.) Then the probability of obtaining R^2 = 0.15 is at least 0.32, making the reported outcome 4 times as likely under H1 as under H0. Say for the sake of argument that you were willing to place prior odds on H1 at 1:4, i.e. 0.2 probability that H1 is true and 0.8 probability that H0 is true. Then by Bayes’ rule the posterior odds would be 1:1, i.e. H0 and H1 equally likely. Even though p = 0.08 is not strong evidence, it does support H1 more than H0.
(That's a sketch. A full development would need some more detail, including explicit examination of conditional independence.)
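The odds arithmetic in that sketch is just the odds form of Bayes’ rule, posterior odds = Bayes factor × prior odds, with the commenter’s stated numbers taken at face value:

```python
# Commenter's numbers, taken at face value: the observed result is said
# to be 4 times as likely under H1 as under H0 (Bayes factor = 4), with
# prior odds on H1 of 1:4.
bayes_factor = 4.0        # P(data | H1) / P(data | H0), as asserted
prior_odds = 1.0 / 4.0    # odds of H1 vs H0 before seeing the data

posterior_odds = bayes_factor * prior_odds
print(f"posterior odds H1:H0 = {posterior_odds:.0f}:1")   # even odds
```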
Null hypothesis testing with conventional levels of statistical significance is not the only important method of statistical inference.
The full paper is worth reading. It’s too bad that the authors and editors have not made the exact data and code used available publicly, but that does not make the paper a waste of reading time. If you and I ran the scientific publishing industry, then data and code would have to be presented as a condition of review, not merely of publication; but we don’t, and in the meantime we have to work with people who don’t agree with us, and that includes readers on this forum, most likely.
vukcevic says:
January 22, 2014 at 2:25 pm
Curve fitting is my speciality, it’s great fun.
But it ain’t science, and should not be peddled as such.
Ah, nuts. The italics are supposed to end before “I think you have overreacted.”
and the word “review” is to be italicized in “as a condition of review“.
[Fixed. -w.]