Guest Post by Willis Eschenbach
There’s a new paper over at IOP called “Forcing of the wintertime atmospheric circulation by the multidecadal fluctuations of the North Atlantic ocean”, by Y Peings and G Magnusdottir, hereinafter Peings2014. I was particularly interested in a couple of things they discuss in their abstract, which says (emphasis mine):
Abstract
The North Atlantic sea surface temperature exhibits fluctuations on the multidecadal time scale, a phenomenon known as the Atlantic Multidecadal Oscillation (AMO). This letter demonstrates that the multidecadal fluctuations of the wintertime North Atlantic Oscillation (NAO) are tied to the AMO, with an opposite-signed relationship between the polarities of the AMO and the NAO. Our statistical analyses suggest that the AMO signal precedes the NAO by 10–15 years with an interesting predictability window for decadal forecasting. The AMO footprint is also detected in the multidecadal variability of the intraseasonal weather regimes of the North Atlantic sector. This observational evidence is robust over the entire 20th century and it is supported by numerical experiments with an atmospheric global climate model.
Let me start with their claim that the AMO signal precedes the NAO by 10-15 years. Here’s the cross-correlation function for the monthly data, using the full 1856-2012 NOAA AMO and the Hurrell NAO data:
Figure 1. Cross-correlation of the full Hurrell NAO and the NOAA AMO, 1856-2012.
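For anyone who wants to check this at home, a cross-correlation like the one in Figure 1 takes only a few lines of code. This is an illustrative sketch, not the code used for the figure, and the `x`/`y` inputs are stand-ins for the monthly AMO and NAO series:

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Pearson correlation of x against y at each lag.

    A positive lag pairs x[t] with y[t + lag], i.e. x leading y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:lag]
        ccf[lag] = np.corrcoef(a, b)[0, 1]
    return ccf
```

Applied to the two monthly indices, the peak of this function gives the lead or lag at which the series agree best.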
Hmmmm … why am I not finding the relationship between AMO and NAO they discuss? I mean, I see that the largest correlation is at zero, and there is a correlation out 15 years, but it’s all so tiny … what’s the problem?
Well, to start with, they are not using the regular AMO index, nor are they using the full year. Instead, here is their description:
A wintertime AMO index is constructed over the 1870– 2012 period using the HadISST dataset (Rayner et al 2003). The monthly SST anomalies are determined with respect to the 1981–2010 climatology, then the winter AMO index is computed by averaging the monthly SST anomalies over the North Atlantic [75W/5W; 0/70N] from December to March (DJFM). The global anomalies of SST are subtracted in order to remove the global warming trend and the tropical oceans influence, as suggested by Trenberth and Shea (2006). A Lanczos low-pass filter is applied to the time series to remove the high-frequency variability (21 total weights and a threshold of 10 years, with the end points reflected to avoid losing data).
Nor are they using the standard NAO, viz:
A decadal NAO index is computed from the 20th century reanalysis (20CR), which is available over 1871–2010 and is based on the assimilation of surface pressure observations only (Compo et al 2011). We use the station-based formulation based on the Stykkisholmur/Reykjavik and Lisbon anomalous sea-level pressure (SLP) difference (Hurrell et al 2003). The high-frequency fluctuations are removed from the NAO index using the same Lanczos filter as for the AMO index.
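The Lanczos filter they mention is a standard piece of kit. The weight construction can be sketched as below; this is the textbook form of the low-pass weights, and assumes nothing about their exact implementation or their end-point reflection:

```python
import numpy as np

def lanczos_lowpass_weights(window, cutoff):
    """Low-pass Lanczos filter weights.

    window -- total number of weights (odd), e.g. 21
    cutoff -- cutoff frequency in cycles per timestep, e.g. 0.1 for a
              10-year threshold on annual (winter) data
    """
    order = (window - 1) // 2
    k = np.arange(-order, order + 1)
    w = np.zeros(window)
    w[order] = 2 * cutoff  # central weight
    kk = k[k != 0]
    # ideal sinc response tapered by the Lanczos sigma factor
    sigma = np.sin(np.pi * kk / order) * order / (np.pi * kk)
    w[k != 0] = np.sin(2 * np.pi * cutoff * kk) / (np.pi * kk) * sigma
    return w / w.sum()  # normalize so the filter preserves the mean
```

Convolving a series with these 21 weights strips out variability faster than the cutoff period, at the cost of losing ten points at each end unless the data is reflected as they describe.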
They are not using the standard AMO, nor the standard NAO, and most importantly, they are using a smoothed subset of the data for calculating the correlations. While using smoothed data is fine for display purposes, it is almost always a Very Bad Idea™ to do statistics and correlations using smoothed data, for reasons discussed below.
In addition, they are not using the full year. Instead, they are using a 4-month subset of the year, DJFM. While there is no inherent problem with doing this, it definitely messes with the statistics. If you want to find a significant correlation using a 4-month subset of the annual data, to achieve a significance level of 0.05 you need to find a four-month chunk with a p-value of one minus the twelfth root of 0.95, or 0.004 …
They go on to say that they have taken autocorrelation into account, viz (emphasis mine):
Figure 2 of Peings2014. ORIGINAL CAPTION: Lead–lag correlations (black curve) between the DJFM AMO and the DJFM decadal NAO indices over 1901–2010. The statistical significance of the correlation is depicted by the p-value (blue dashed curve), computed using a bootstrap method that takes into account auto-correlations in the time series. The 95% confidence level is indicated by the dashed black line.
I note that they are using p=0.05 as their significance level, despite the fact that they are using partial-year correlations.
Now it’s wonderful that they have used a “bootstrap method” to allow for auto-correlations … but that’s the sum total of the information that they give us about their whizbang bootstrap method. I generally use the method of Quenouille to adjust the effective sample size for auto-correlation.
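The Quenouille adjustment reduces the effective number of independent datapoints according to the lag-one autocorrelations of the two series, n_eff = n (1 − r1 r2) / (1 + r1 r2). A minimal sketch:

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-one autocorrelation of a series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def effective_n(x, y):
    """Quenouille effective sample size for correlating two
    autocorrelated series: n_eff = n * (1 - r1*r2) / (1 + r1*r2)."""
    r1, r2 = lag1_autocorr(x), lag1_autocorr(y)
    return len(x) * (1 - r1 * r2) / (1 + r1 * r2)
```

For white noise n_eff is essentially n; for heavily smoothed series it collapses to a small fraction of n, which is exactly why smoothing wrecks significance calculations.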
I digitized their data to see if I could replicate their Figure 2. Figure 3 shows that result:
Figure 3. My emulation of Peings2014 Figure 2. Red shows p-values less than 0.05. “DJFM” is December-January-February-March. Auto-correlation is adjusted for by the method of Quenouille detailed above.
While the general shape is similar to Figure 2, there are a number of differences between what I find and what they find. Overall, my correlation “R” (black line) is slightly smaller. Their correlation has a maximum of about 0.55 and a minimum of -0.75, while mine has a maximum of 0.5 and a minimum of -0.67. And while their results show R = +0.2 at a lag of -30, my results show R ≈ 0. Not a lot of difference, to be sure … but I’m using their own data, so the match should be exact.
Next, I find higher results for the p-value. Only the lags -2 to -7, and 23 to 27, are significant at the 0.05 level.
However, remember that they have used only part of the dataset, the values from December to March. Assuming that they searched all of the 4-month periods before settling on DJFM, that’s a dozen different samples that they have searched. And it may be more than a dozen, because I would assume that they first looked by quarters (three months). As a result, if you search that many subsets, your odds of finding a result with a p-value of 0.05 purely by chance are quite large …
The net result is that if you look at twelve samples, you need to find a p-value of
<blockquote>1 – 0.95<sup>1/12</sup> = 0.004</blockquote>
to be statistically significant at the 0.05 level … and that’s not happening anywhere in their graph.
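The arithmetic is easy to check. This is the standard Šidák-style correction for taking k independent looks at the data while keeping an overall 0.05 false-alarm rate:

```python
def per_test_threshold(alpha, k):
    """p-value each individual test must beat so that k independent
    looks at the data keep an overall false-alarm rate of alpha."""
    return 1 - (1 - alpha) ** (1 / k)

# about 0.004 for twelve 4-month subsets, matching the figure in the text
threshold = per_test_threshold(0.05, 12)
```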
Next, they do not find the correlation with the AMO lagging the NAO that appears in my results.
Next, there is an oddity, I might even say an impossibility, in their result. Look at the left hand side of Figure 2. Remember that as the lead gets longer and longer, we are using fewer and fewer datapoints in the calculation. In addition, as the lead gets longer, the correlation ( R ) is decreasing. Now, with fewer datapoints and a smaller correlation, the p-value should steadily increase. You can see that in my graph: the maximum correlation and the minimum p-value are at about a two-year lead, and then as the lead heads out to 30 years, both R and the number of datapoints decrease.
But when both the correlation and the number of datapoints go down, the p-value has to increase … and while that is visible in my results, we don’t see anything like that in their results.
I am mystified by the difference between my results and theirs. I know that the digitization is accurate to within the width of the lines; here’s the proof, a screenshot of the digitization process for their Figure S3 …
Figure 4. Screenshot of the digitization process, showing that the errors are less than half a linewidth …
Finally, I have grave reservations about this general type of analysis. Basically, the AMO and the NAO represent subsections of the global temperature record. And as the names suggest, the NAO (North Atlantic Oscillation) is in itself a subset of the AMO (Atlantic Multidecadal Oscillation), representing the northern part.
As a result, I would be shocked if we did NOT find something akin to Figures 2 or 3 above. And in fact, a Monte Carlo analysis using proxy data with autocorrelation characteristics like the highly smoothed data that they are using easily generates the kind of curves shown above. That’s what happens when one dataset is a subset of another dataset, and it should not be a surprise to anyone.
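That Monte Carlo is easy to reproduce in outline. The sketch below uses independent AR(1) series and a simple moving average as a crude stand-in for a 21-weight low-pass filter; the parameters are illustrative, not theirs:

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1(n, phi):
    """Autocorrelated red-noise series."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

def max_abs_ccf(x, y, max_lag=30):
    """Largest absolute cross-correlation over all leads and lags."""
    n = len(x)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        a = x[:n - lag] if lag >= 0 else x[-lag:]
        b = y[lag:] if lag >= 0 else y[:lag]
        best = max(best, abs(np.corrcoef(a, b)[0, 1]))
    return best

smooth = np.ones(21) / 21  # crude stand-in for a 21-weight low-pass filter
peaks = [max_abs_ccf(np.convolve(ar1(110, 0.5), smooth, mode="valid"),
                     np.convolve(ar1(110, 0.5), smooth, mode="valid"))
         for _ in range(200)]
# typically a sizeable correlation despite zero true relationship
median_peak = float(np.median(peaks))
```

The point of the exercise: the two series here are independent by construction, yet after smoothing, the peak cross-correlations are routinely large enough to look “significant” by a naive test.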
In addition, such relationships are often not stable over time. For example, Figure 5 shows the cross-correlation for the AMO and NAO datasets (1901-2010), along with the identical cross-correlation calculations for the first halves (1901-1955) and for the second halves (1956-2010) of the two datasets. As you can see, the relationship is far from consistent, with cross-correlations of the two halves being different from each other, and both being different from the full dataset as well. This increases the chance that we are looking at a spurious correlation.
Figure 5. A comparison of the cross-correlation of the 30-year smoothed AMO and NAO datasets with the cross-correlations of the first halves and the second halves of the same two datasets.
As you recall, they claim in their Abstract (above) that their results are “robust over the entire 20th Century”, but their own data says otherwise.
CONCLUSIONS
In no particular order:
• Since the NAO is a subset of the AMO, we would expect cross-correlation between the two at a number of leads and lags … and that’s what we find. The authors seem to find that impressive, but their results show levels of significance and shapes of the cross-correlation that are quite commonplace when one dataset is a subset of another and the two datasets are heavily smoothed.
• They have made no attempt to adjust their significance levels to reflect the fact that they have chosen one of twelve or more possible monthly subsets of the data. This is a huge oversight, and one that puts all of their conclusions into doubt.
• I am unable to replicate the results of their cross-correlations (what they call “lead-lag” correlations above) of the smoothed 1901-2010 DJFM NAO and AMO.
• I am also unable to replicate the results of their “bootstrap” method of calculating the p-value, although that is undoubtedly related to the fact that they did not disclose their secret method …
• They neglected to include a description of one of the most important parts of their analysis, the calculation of the significance using a bootstrap method.
• The use of smoothed data in doing cross-correlation analyses is an abomination. Nature knows nothing of 30-year averages. Either there is significant cross-correlation between the two actual datasets or there is not. Using smoothed datasets can even generate totally spurious correlations. I give some examples here … and lest you think that I made up the idea that smoothing can lead to totally spurious correlations, it’s actually called the “Slutsky-Yule Effect”. Their use of smoothed datasets for cross-correlation is by itself enough to entirely disqualify their study.
• As a result, were I a reviewer I could not agree with the publication of this study until those problems are solved.
A couple of things in closing. First, Science magazine recently decided to add a statistician to the peer-review panel for all studies … and as this paper clearly demonstrates, all journals might profitably do the same.
And second, the AMO and the PDO and the NAO are all parts of the global temperature record. As a result, using them to emulate the global temperature record as the authors have done can best be described as cheating. When someone does that, they are using part of what they are trying to predict as an explanatory variable …
And while (as the authors show) that is often a way to get impressive results, it’s like saying that you can predict the average temperature for tomorrow, as long as you already know tomorrow’s temperature from noon to 2pm. Which is not all that impressive, is it?
My best regards to all,
w.
De Maximis: If you disagree with me, and many do on any given day, please quote the exact words that you disagree with. That way, we can all understand exactly what your objection might be.
DATA AND CODE: The digitized 30-year smoothed datasets of the AMO and the NAO are here. The NOAA AMO data is online here, and the Hurrell NAO data is here. I haven’t posted the computer code. It is a pig’s breakfast, and as opposed to being “user-friendly”, it is actively user-aggressive … I may clean it up if I get time, but my life is a bit crazy at the moment, the data is there, and a cross-correlation is a very simple analysis that folks can do on their own.

NAO power spectrum from daily data since 1950, all year data.
http://climategrog.wordpress.com/?attachment_id=897
There is a peak around 60 years. Though the period will not be very accurately determined, it could be a tie-in with the N.Atl SST. Like W. says, this would not be too surprising.
I have not looked at the phase relationship, but if this paper is suggesting 15 years, it may indicate one is related to the derivative of the other.
Atm pressure links to cloud cover so NAO varying with d/dt(SST) may be somewhat likely.
Indeed. Cross-correlation of NAO and d/dt( ICOADS SST ) provides a nice clean spectrum.
http://climategrog.wordpress.com/?attachment_id=898
The lunar signal is once again strongly present. This looks like the 9.1 +/-0.1 again.
Despite their rather convoluted method, going straight for it with the least-processed data seems to confirm a common circa-60y periodicity. I have not looked directly at the phase but, as I guessed above, if one is related to the derivative of the other, a pi/2 lag would be expected. For ~60 years that would be 15y.
The presence of a strong 9 y signal and their rather chopped about data may account for their vague 10-15 year result.
If one is interested in exploring the relationship between a multidecadal phenomenon, such as the AMO, and another such variable, then simply computing the cross-correlation function of MONTHLY data will NOT produce indicative results. The high-variance, month-to-month random variability of temperature will almost certainly produce negligible cross-correlation at all lags. Only by doing complete cross-spectrum analysis or by appropriate band-pass filtering can that relationship (if any) be revealed.
” Only by doing complete cross-spectrum analysis or by appropriate band-pass filtering can that relationship (if any) be revealed.”
Like I just did, you mean?
I recall an earlier article where you were showing the perils of running correlations on smoothed data, it was about some other paper making the same mistake. It seems a lot of papers involve all sorts of random data fluctuations until they find something that appears to look interesting.
Greg:
You did NOT do a complete cross-spectrum analysis. It produces a complex-valued, frequency-dependent result C – iQ, where C is the co-spectrum and Q is the quadrature spectrum; those spectra are obtained, respectively, by the cosine- and sine-transforms of the sample cross-covariance function between two variables whose power densities are P1 and P2. What’s truly indicative in those results is the squared coherence (C^2 + Q^2)/(P1*P2) and the relative phase arctan(-Q/C) between the variables in each frequency band. What you present is far removed from that, and is plotted inappropriately as a function of period.
P.S. Of course, the sample cross-covariance function needs to be extended to lags long enough to encompass periods of interest (which Willis fails to do) and be properly windowed to avoid spurious Gibbs effects in the results.
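[For the curious, 1sky1’s recipe can be turned into code directly. This numpy-only sketch estimates the quantities he names, squared coherence and relative phase, via Welch-style segment averaging; it is an illustration, not anyone’s actual analysis, and the segment count is arbitrary:]

```python
import numpy as np

def cross_spectrum(x, y, nseg=8):
    """Welch-style cross-spectral estimate: split the series into nseg
    segments, average the (cross-)periodograms, and return frequencies
    (cycles per timestep), squared coherence, and relative phase."""
    n = len(x) // nseg
    win = np.hanning(n)
    Pxx = Pyy = Pxy = 0
    for i in range(nseg):
        xs = (x[i * n:(i + 1) * n] - x[i * n:(i + 1) * n].mean()) * win
        ys = (y[i * n:(i + 1) * n] - y[i * n:(i + 1) * n].mean()) * win
        X, Y = np.fft.rfft(xs), np.fft.rfft(ys)
        Pxx = Pxx + np.abs(X) ** 2
        Pyy = Pyy + np.abs(Y) ** 2
        Pxy = Pxy + X * np.conj(Y)
    freqs = np.fft.rfftfreq(n)
    coh2 = np.abs(Pxy) ** 2 / (Pxx * Pyy)  # squared coherence, 0..1
    phase = np.angle(Pxy)                  # relative phase per band
    return freqs, coh2, phase
```

Note that without the segment averaging the coherence estimate is identically 1 at every frequency, which is why the averaging (or equivalent windowing of the cross-covariance) is not optional.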
Bit O/T. But my ex was a Vulcan pilot with the nuclear strike force. He went on several missions to Norway, to do cold weather survival with experienced Norwegian instructors. Even ate whale meat. But one of the things they learned was if they survived a crash in the North Atlantic, they had little time to get into their survival dinghy. They would die in a very short time from hypothermia. So has the water heated up enough now to save people? No. Like the Titanic passengers.
Using subsets of a data set to predict future positions of an entire data set can have its uses. Consider that IBM is part of the DJI. If the IBM movement for one week predicted the DJI trend for the following week I’m pretty sure some people would find that useful.
“There is a peak around 60 years.”
We really need to get to the bottom of this sixty year thing.
Calculate this please someone….
http://en.wikipedia.org/wiki/File:Golfstrom.jpg
I agree with Juraj V. It is all just a natural multidecadal fluctuation. NAO is not a temperature index. It’s the difference of atmospheric pressure at sea level between the Icelandic low and the Azores high. Positive and negative PDO phases are correlated with increasing and decreasing temperature indices (-PDO cooling, +PDO warming).
http://www.woodfortrees.org/plot/jisao-pdo/normalise/plot/hadcrut4gl/mean:120/derivative/normalise
Willis
The significance of PDO, AMO and NAO being able to emulate global temperature record is they are natural cycles. It means climate change is driven by natural cycles. No small feat considering all the “science is settled” of AGW.
Greg Goodman says:
April 2, 2014 at 1:32 pm
Thanks, Greg. I was not talking about removing seasonal variations, sorry for the confusion. I was talking about the kind of egregious smoothing that they are doing … and sadly, that’s the common kind.
Since on average the SLP is a function of SST, measuring pressure is just another way of measuring the temperature. You can see this by comparing e.g. the SOI and ENSO3.4. Despite the fact that one is measuring pressure over a large area and the other is measuring temperature over a smaller area, their annual correlation is a staggering 0.8 …
Or you can just compare the NAO and the AMO directly. Despite one measuring pressure and one temperature, on an annual basis, they have a correlation of -0.43 …
Regards,
w.
Bloke down the pub says:
April 2, 2014 at 3:40 pm
See Figure 4, Bloke.
w.
1sky1: “What’s truly indicative in those results is the squared coherence (C^2 + Q^2)/(P1*P2) and the relative phase arctan(-Q/C) between the variables in each frequency band. What you present is far removed from that–and is plotted inappropriately as a function of period. ”
Thanks , it was not clear what you’d meant by “complete” cross-spectrum analysis.
Now isn’t squared coherence (C^2 + Q^2)/(P1*P2) just what I did ? I agree there would be additional information to be obtained from looking at the phase relationship.
A period plot is not “inappropriate”; it’s a damned sight easier to read, since most people are discussing cycle periods and don’t think in “per year” units of frequency. If you want to start integrating the area under the curve to assess the power of a peak, you will want to be working in frequency. The periodogram is “appropriate” for the discussion of what periods are present and the relative height of peaks.
Richard M says:
April 2, 2014 at 8:13 pm
That’s true, Richard, but that’s not what these folks are doing. They are predicting the DJI for next week by using the IBM price for next week … how you gonna do that?
w.
Dr. Strangelove says:
April 3, 2014 at 1:02 am
Thanks, Doc. You cannot assume causation simply because they are correlated. In fact, the AMO and the global temperature are correlated because the AMO is a part of the global temperature record. If
A = B + C
then A and B will be correlated. But that doesn’t mean that B is “driving” A as you say. Instead, it is simply a consequence of the fact that B is a part of A.
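A quick numerical check of this, using synthetic data; with independent, equal-variance parts the expected correlation of the whole with either part is 1/sqrt(2), about 0.71:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal(1000)
C = rng.standard_normal(1000)
A = B + C  # A "contains" B, the way a global index contains a regional one

# substantial correlation, yet B does not "drive" A in any causal sense
r = np.corrcoef(A, B)[0, 1]
```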
w.
Since the NAO is a subset of the AMO
No it’s not. AMO is the integral of NAO (long term)
http://virakkraft.com/NAO-AMO
High NAO gives more sunshine hours to northern Europe so it makes sense.
http://virakkraft.com/Sunshine-duration.pptx
Willis, AMO is not a part of the SH temperature indices and they still correlate.
http://www.woodfortrees.org/plot/esrl-amo/plot/hadcrut4sh/detrend:0.8/plot/hadcrut4nh/trend/detrend:0.8
The mode of variability known as AMO is actually global and the North Atlantic SST is part of that. The secular trend in this global oscillation is another longer quasi-oscillation.
For those who have a built-in reciprocal button here is the NAO vs N.Atl SST cross-spectrum plotted against frequency.
http://climategrog.wordpress.com/?attachment_id=899
lgl says:
No it’s not. AMO is the integral of NAO (long term)
http://virakkraft.com/NAO-AMO
Thanks, that is the guess I was working on when doing the cross-correlation using d/dt(SST).
This is like the d/dt(CO2) vs SST thing all over again. Unless you are looking at the correct form of the variables you will not get the right relationship. When the paper found a lag of 15 and a cycle of 60, I saw the obvious indication of a pi/2 lag of two quantities in quadrature.
Integrating NAO or diffing SST amounts to the same thing (except the latter is better for FFT etc).
Since lgl would like a complete analysis, here’s the phase plot too:
http://climategrog.wordpress.com/?attachment_id=900
The circa 60y and 9y peaks are close to anti-phase, when comparing NAO to d/dt(SST), the strongest peak (on the inter-annual scale) at 2.68y is close to being in-phase.
Spectra were calculated with d/dt(N_Atl_SST) lagging NAO taken as positive correlation.
Just a note on visualising the phase plot: since phase cycles in 2*pi radians, +3.142 = -3.142 , so it’s not jumping, it’s cycling evenly around in phase. The upper limit and lower limit represent the same phase lag: the two variables being in anti-phase.
Greg
Integrating NAO or diffing SST amounts to the same thing
It’s perhaps a bit confusing, but diffing the AMO gives something similar to ENSO, not the NAO, so I guess ENSO is the dominant driver on the short term and NAO on the longer term.
“Abstract: Our statistical analyses suggest that the AMO signal precedes the NAO by 10–15 years with an interesting predictability window for decadal forecasting.”
The AMO signal appears to precede the ENSO by at least ~74 years, and with a lot less manipulation of the data.
http://wattsupwiththat.com/2014/02/13/a-relationship-between-sea-ice-anomalies-ssts-and-the-enso/