NAO and Then

Guest Post by Willis Eschenbach

Anthony recently highlighted a new study which purports to find that the North Atlantic Oscillation (NAO) is synchronized to the fluctuations in solar activity. The study is entitled “Solar forcing synchronizes decadal North Atlantic climate variability”. The “North Atlantic Oscillation” (NAO) refers to the phenomenon that the temperatures (and hence the air pressures) of the northern and southern regions of the North Atlantic oscillate back and forth in opposition to each other, with first the northern part and then the southern part being warmer (lower pressure) and then cooler (higher pressure) than average. The relative swings are measured by an index called the North Atlantic Oscillation Index (NAOI). The authors’ contention is that the sun acts to synchronize the timing of these swings to the timing of the solar fluctuations.

Their money graph is their Figure 2:

Figure 1. Figure 2a from the study, showing the purported correspondence between solar variations (gray shaded areas at bottom) and the North Atlantic Oscillation Index (NAOI). Original Caption: (a) Time series of 9–13-year band-pass filtered NAO index for the NO_SOL [no solar input] (solid thin) and SOL [solar input] (solid thick) experiments, and the F10.7cm solar radio flux (dashed black). Red and blue dots define the indices used for NAO-based composite differences at lag 0 (see the Methods section). For each solar cycle, maximum are marked by vertical solid lines.

From their figure, it is immediately apparent that they are NOT looking at the real world. They are not talking about the Earth. They are not discussing the actual North Atlantic Oscillation Index nor the actual f10.7 index. Instead, their figures are for ModelEarth exclusively. As the authors state but do not over-emphasize, neither the inputs (“F10.7”) to the computer model nor the outputs of the computer model (“Filtered NAOI”) are real—they are figments of either the modelers’ or the model’s imaginations, understandings, and misapprehensions …

The confusion is exacerbated by the all-too-frequent computer modelers’ misuse of the names of real observations (e.g. “NAOI”) to refer to what is not the NAOI at all, but is only the output of a climate model. Be clear that I am not accusing anyone of deception. I am saying that the usual terminology style of the modelers makes little distinction between real and modeled elements, with both often being called by the same name, and this mis-labeling does not further communication.

This brings me to my first objection to this study, which is not to the use of climate models per se. Such models have some uses. The problem is more subtle than that. The difficulty is that the outputs of climate models, including the model used in this study, are known to be linear or semi-linear transformations of the inputs to those climate models. See e.g. Kiehl's seminal work, "Twentieth century climate model response and sensitivity", as well as my posts here and here.

As a result, we should not be surprised that if we include solar forcings as inputs to a climate model, we will find various echoes of the solar forcing in the model results … but anyone who thinks that these cyclical results necessarily mean something about the real world is sadly mistaken. All that such a result means is that climate models, despite their apparent complexity, function as semi-linear transformation machines that mechanically grind the input up and turn it into output, and that if you have cyclical input, you’ll be quite likely to get cyclical output … but only in ModelEarth. The real Earth is nowhere near that linear or that simple.

My second objection to their study is, why on earth would you use a climate model with made-up “solar forcing” to obtain modeled “Filtered NAOI” results when we have perfectly good observational data for both the solar variations and the NAO Index??? Why not start by analyzing the real Earth before moving on to ModelEarth? The Hurrell principal component NAOI observational dataset since 1899 is shown in Figure 2a. I’ve used the principal component NAO Index rather than the station index because the PC index is used by the authors of the study.

Figure 2a. Hurrell principal component based North Atlantic Oscillation Index. Red line shows the same data with a 9-13-year bandpass filter applied. DATA SOURCE

Here you can see the importance of using a longer record. Their results shown in Figure 1 above start in 1960, a time of relative strength in the 9-13-year band (red line above). But for the sixty years before that, there was little strength in the same 9-13-year band. This kind of appearance and disappearance of apparent cycles, which is quite common in climate datasets, indicates that they do not represent a real persisting underlying cycle.
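For anyone who wants to play along at home, here is a minimal R sketch of this kind of filtering. It assumes the annual Hurrell NAOI values are already loaded into a numeric vector called naoi, and it uses a second-order Butterworth bandpass from the "signal" package as one reasonable choice of 9-13-year filter; the exact filter design is not critical to the argument.

# Minimal sketch: 9-13-year bandpass of an annual NAOI series.
# Assumes `naoi` is a numeric vector of annual NAO index values.
library(signal)                            # provides butter() and filtfilt()

# Annual data: Nyquist frequency = 0.5 cycles/year, so normalize band edges by 0.5
band <- c(1/13, 1/9) / 0.5                 # 9-13-year periods as normalized frequencies
bp   <- butter(2, band, type = "pass")     # 2nd-order Butterworth bandpass filter

naoi_filtered <- filtfilt(bp, naoi)        # zero-phase filtering, so no time shift

plot(seq_along(naoi), naoi, type = "l", col = "grey", xlab = "Year index", ylab = "NAOI")
lines(seq_along(naoi), naoi_filtered, col = "red", lwd = 2)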

Which brings me to my next objection. Comparing a variable 11-year solar cycle to a 9-13-year bandpass filtered NAOI dataset seemed to me likely to look significant even when it isn't significant at all. In other words, from looking at the data I suspected that similar 9-13-year bandpassed red noise would show much the same type of pattern in the 9-13-year band.

To test this, I used simple “ARMA” red noise. ARMA stands for “Auto-Regressive, Moving Average”. I first calculated the lag-1 AR and MA components of the DJF NAOI data. These turn out to be AR ≈ 0.4, and MA ≈ – 0.2. This combination of a positive AR value and a negative MA value is quite common in climate datasets.
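In R, those two numbers fall straight out of an ARMA(1,1) fit; a minimal sketch, again assuming the DJF NAOI is in a numeric vector called naoi:

# Fit an ARMA(1,1) model to the DJF NAOI to get the lag-1 AR and MA coefficients.
fit <- arima(naoi, order = c(1, 0, 1))     # ARMA(1,1) with a mean term
coef(fit)["ar1"]                           # lag-1 autoregressive coefficient, about 0.4
coef(fit)["ma1"]                           # lag-1 moving-average coefficient, about -0.2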

Then I generated random ARMA “pseudo-data” of the same length as the DJF NAOI data (116 years), and applied the 9-13-year bandpass filter to each pseudo-dataset. Figure 2b shows four typical random red-noise pseudo-data results:

Figure 2b. As in Figure 2a, but using ARMA red-noise random pseudo-data. Heavy red/black lines show the result of applying the 9-13-year bandpass filter to the pseudo-data.

As I suspected, red noise datasets of the same ARMA structure as the DJF NAOI data generally show a strong signal in the 9-13-year range. This signal typically varies in strength across the length of the pseudo-datasets. However, given that these are random red-noise datasets, it is obvious that such strong signals in the 9-13-year range are meaningless.
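To generate your own pseudo-data and check this, here is a minimal sketch using arima.sim() with the coefficients above, and the same style of Butterworth bandpass as before; again, one reasonable filter choice rather than necessarily the exact one behind the figures.

# Generate ARMA(1,1) red-noise pseudo-data and bandpass it in the 9-13-year range.
library(signal)                              # butter(), filtfilt()

n  <- 116                                    # same length as the DJF NAOI record
bp <- butter(2, c(1/13, 1/9) / 0.5, type = "pass")

par(mfrow = c(2, 2))                         # four panels, as in Figure 2b
for (i in 1:4) {
  pseudo <- arima.sim(model = list(ar = 0.4, ma = -0.2), n = n)
  plot(1:n, pseudo, type = "l", col = "grey", xlab = "Year", ylab = "Pseudo-NAOI")
  lines(1:n, filtfilt(bp, as.numeric(pseudo)), col = "red", lwd = 2)
}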

So the signal seen in the actual DJF NAOI data is by no means unusual … and in truth, well … I fear to admit that I've snuck the actual DJF NAOI in as the lower left panel in Figure 2b … bad, bad dog. But comparing that with the upper left panel of the same Figure illustrates my point quite clearly. The actual data contains what appears to be a signal in the 9-13-year range … but it's most likely nothing but an artifact, because it is indistinguishable from the red-noise results.

My next objection to the study is that they have used the "f10.7" solar index as a measure of the sun's activity. This is the strength of the solar radio flux at the 10.7 cm wavelength, and it is a perfectly valid observational measure to use. However, in both phase and amplitude, the f10.7 index runs right in lock-step with the sunspot numbers. Here's NASA's view of a half-century of both datasets:

Figure 3. Monthly sunspot numbers (upper panel) and monthly f10.7 cm radio wave flux index (lower panel). SOURCE

As you can see, using one or the other makes no practical difference at the level of analysis done by the authors. The difficulty is that the f10.7 data is short, whereas we have good sunspot data much further back in time than we have f10.7 data … so why not use the sunspot data?

My next objection to the study is that it seems the authors haven’t heard of Bonferroni and his correction. If you flip a group of 8 coins once and they come up all heads, that’s very unusual. But if you throw the same group of 8 coins a hundred times, somewhere in there you’ll likely come up with eight heads.

In other words, how unusual something is depends on how many places you’ve looked for it. If you look long enough for even the rarest relationship, you’ll likely find it … but that does not mean that the find is statistically significant.

In this case, the problem is that they are only using the winter-time (DJF) value of the NAOI. To get to that point, however, they must have tried the annual NAOI, as well as the other seasons, and found them wanting. If the other NAOI results were statistically significant and thus interesting, they would have reported them … but they didn't. This means that they've looked in five places to get their results—the annual data as well as the four seasons individually. And this in turn means that to claim significance for their find, they need to show something which is more rare than if they had just looked in one place.

The “Bonferroni correction” is a rough-and-ready way to calculate the effect of looking in more places or conducting more trials. The correction says that whatever p-value you consider significant, say 0.05, you need to divide that p-value by the number of trials to give the equivalent p-value needed for true significance. So if you have 5 trials, or five places you’ve looked, or five flips of 8 coins, at that point to claim statistical significance you need to find something significant at the 0.05 / 5 level, which is a p-value of less than 0.01 … and in climate, that’s a hard ask.
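The arithmetic is trivial, but here it is as a short R sketch, including the eight-coin example:

# Eight fair coins: chance of all heads on one throw, and on at least one of 100 throws.
p_single <- 0.5^8                        # about 0.004
p_in_100 <- 1 - (1 - p_single)^100       # about 0.32 ... not unusual at all

# Bonferroni correction for five places looked (annual plus the four seasons).
alpha          <- 0.05
n_trials       <- 5
alpha_adjusted <- alpha / n_trials       # 0.01, the p-value now needed for significance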

So those are my objections to the way they’ve gone about trying to answer the question.

Let me move on from that to how I’d analyze the data. Here’s how I’d go about answering the same question, which was, is there a solar component to the DJF North Atlantic Oscillation?

We can investigate this in a few ways. One is by the use of “cross-correlation”. This looks at the correlation of the two datasets (solar fluctuations and NAO Index) at a variety of lags.
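In R the cross-correlation is a one-liner; a minimal sketch, assuming the annual sunspot numbers and the DJF NAOI are in aligned numeric vectors called sunspots and naoi:

# Cross-correlation of the DJF NAOI against annual sunspot numbers.
cc <- ccf(naoi, sunspots, lag.max = 20)
# R's convention: ccf(x, y) at lag k estimates cor(x[t + k], y[t]), so here
# positive lags correspond to the NAOI lagging (responding after) the sunspots.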

Figure 4. Cross-correlation, NAO index and sunspots. NAO index data source as above. Sunspot data source.

As you can see, the maximum short-lag positive correlation is with the NAO data lagging the sunspots by about 2-3 years. But the fact that the absolute correlation is largest with the NAO data leading the sunspots (negative values of lag) by two years is a huge red flag, because it is not possible that the NAO is influencing the sun. This indicates we’re not looking at a real causal relationship. Another problem is the small correlation values. The r^2 of the two-year-lagged data is only 0.03, and the p-value is 0.07 (not significant). And this is without accounting for the cyclical nature of the sunspot data, which will show alternating positive and negative correlations of the type shown above even with random “red-noise” data. Taken in combination, these indicate that there is very little relationship of any kind between the two datasets, causal or otherwise.

Next, we can search for any relationship between the solar cycle and the DJF NAOI using Fourier analysis. To begin with, here is the periodogram of the annual sunspot data. As is my habit, I first calculate the periodogram of the full dataset. Then I divide the dataset in two, and calculate the periodograms of the two halves individually. This lets me see if the cycles are present in both halves of the data, to help establish if they are real or are only transient fluctuations. Here is that result.

Figure 5. Periodograms of the full sunspot dataset (black), and of the first and second halves of the data.

As you can see, the three periodograms are quite similar, showing that we are looking at a real, persistent (albeit variable) cycle in the sunspot data. This is true even for the ~ 5-year cycle, as it shows up in all three analyses.
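For those following along, periodograms like these are easy to produce with base R's spec.pgram(); a minimal sketch, assuming the series of interest (sunspots or NAOI) is in a numeric vector called x:

# Periodograms of a full annual series and of its two halves.
pgram <- function(v) spec.pgram(v, taper = 0, detrend = TRUE, plot = FALSE)

n     <- length(x)
full  <- pgram(x)
half1 <- pgram(x[1:floor(n / 2)])
half2 <- pgram(x[(floor(n / 2) + 1):n])

# Plot spectral power against period (1 / frequency) in years.
plot(1 / full$freq, full$spec, type = "l", log = "x",
     xlab = "Period (years)", ylab = "Spectral power")
lines(1 / half1$freq, half1$spec, col = "blue")
lines(1 / half2$freq, half2$spec, col = "red")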

However, the situation is very different with the DJF NAOI data.

Figure 6. Periodograms of the full North Atlantic Oscillation Index data, and of the first and second halves of the data.

Unlike the sunspot data, the three NAOI periodograms are all very different. There are no cycles common to all three. We can also see the lack of strength in the 9-13-year region in the first half compared with the second half. All of this is another clear indication that there is no strong persistent cycle in the NAOI data in the 9-13-year range, whether of solar or any other origin. In other words, the DJF NAOI is NOT synchronized to the solar cycle as the authors claim.

Finally, we can investigate their claim that the variations in solar input are driving the DJF NAOI into synchronicity by looking at what is called “Granger causality”. An occurrence “A” is said to “Granger-cause” occurrence “B” if we can predict B better by using the history of both A and B than we can predict B by using just the history of B alone. Here is the Granger test for the sunspots and the DJF NAOI:

> grangertest(Sunspots,DJFNAOI)
Granger causality test

Model 1: DJFNAOI ~ Lags(DJFNAOI, 1:1) + Lags(Sunspots, 1:1)
Model 2: DJFNAOI ~ Lags(DJFNAOI, 1:1)
  Res.Df Df      F Pr(>F)
1    112
2    113 -1 0.8749 0.3516

The Granger causality test looks at two models. One model (Model 2) tries to predict the DJF NAOI by looking just at the previous year’s DJF NAOI. The other model (Model 1) includes the previous year’s sunspot information as an additional independent variable, to see if the sunspot information helps to predict the DJF NAOI.
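The output above is in the format produced by the grangertest() function in R's lmtest package; a minimal sketch to reproduce it, assuming the same aligned annual vectors as before:

# Granger causality: does sunspot history improve prediction of the DJF NAOI?
library(lmtest)                             # provides grangertest()
grangertest(Sunspots, DJFNAOI, order = 1)   # one year of lags, output shown above
grangertest(Sunspots, DJFNAOI, order = 2)   # two years of lags, output shown below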

The result of the Granger test (p-value of 0.35) does not allow us to reject the null hypothesis, which is that there is no causal relationship between sunspots and the NAO Index. It shows that adding solar fluctuation data does not improve the predictability of the NAOI. And the same is true if we include more years of historical solar data as independent variables, e.g.:

> grangertest(Sunspots,DJFNAOI,order = 2)
Granger causality test

Model 1: DJFNAOI ~ Lags(DJFNAOI, 1:2) + Lags(Sunspots, 1:2)
Model 2: DJFNAOI ~ Lags(DJFNAOI, 1:2)
  Res.Df Df      F Pr(>F)
1    109
2    111 -2 0.4319 0.6504

This is even worse, with a p-value of 0.65. The solar fluctuation data simply doesn’t help in predicting the future NAOI, so again we cannot reject the null hypothesis that there is no causal relationship between solar fluctuations and the North Atlantic Oscillation.

CONCLUSIONS

If you use a cyclical forcing as input to a climate model, do not be surprised if you find evidence of that cycle in the model’s output … it is to be expected, but it doesn’t mean anything about the real world.

The cross-correlation of a century's worth of data shows that the relationship between the sunspots and the DJF NAOI is not statistically significant at any lag, and it does not indicate any causal relationship between solar fluctuations and the North Atlantic Oscillation.

The periodogram of the NAOI does not reveal any consistent cycles, whether from solar fluctuations or any other source.

The Granger causality test does not allow us to reject the null hypothesis that there is no causal relationship between solar fluctuations and the North Atlantic Oscillation.

Red-noise pseudodata shows much the same strong signal in the 9-13-year range as is shown by the DJF NAOI data.

And finally … does all of this show that there is no causal relationship between solar fluctuations and the DJF NAO?

Nope. You can never do that. You can’t demonstrate that something doesn’t exist.

However, it does mean that if such a causal relationship exists, it is likely to be extremely weak.

Regards to all,

w.

My Customary Request: If you disagree with someone, please quote the exact words that you object to. This lets us all understand the exact nature of your objections.


129 Comments
September 18, 2015 2:50 pm

again Willis, concise, clear. Thank you

RCS
September 18, 2015 3:05 pm

1) Have you determined the distribution of the cross correlation between randomly generated sets of red data? Specifically there are minima at ~ +/- 10 years in the CCF. Since you are generating random data, with a random inter-sample phase relationship, calculation of the limits of these minima is important.
2) The paper appears to discuss non-linear systems. The periodograms are typical of entrainment, which occurs when a non-linear system is perturbed with a quasi-periodic input. Therefore I don't think that linear analysis is sufficient to determine a relationship.
I agree that modelling in this case gives some questionable results.

Evan Jones
Editor
Reply to  Willis Eschenbach
September 18, 2015 5:39 pm

Also in conformity with the terminology of scientific method ("as she is spoke"). Even if on further review you are challenged as to your findings, this is of illustrative value for that reason alone.

george e. smith
Reply to  Willis Eschenbach
September 18, 2015 9:15 pm

“””””….. If you flip a group of 8 coins once and they come up all heads, that’s very unusual. But if you throw the same group of 8 coins a hundred times, somewhere in there you’ll likely come up with eight heads. ….”””””
If you flip a set of eight coins bearing the serial numbers 1,2,3,4,5,6,7,8.
And you record which coins come up heads and which coins come up tails, how likely is it for any one of those patterns to come up ??
And is that probability any different from the probability of all heads, or all tails ??
g
On another subject: you plot the discontinuous function consisting of straight line sections from dot to dot.
Then you filter it with your low pass filter and voilà, you now have a continuous function.
If one assumes that the discontinuous function is actually just samples of what presumably is also a continuous function, why does nobody use some sort of cubic spline or other interpolation algorithm to approximate the original continuous function from which the samples were obtained?
I make "scatter plots" all the time, and plot them in Excel, and it will draw a nice smooth curve through my data points with no discontinuities; and that presumably is a more accurate presentation of that data.
g

george e. smith
Reply to  Willis Eschenbach
September 19, 2015 6:46 am

If you make your own ‘ coins ‘ by laser cutting them out of wafers of diamond, that are perfectly flat and aligned to a particular crystal orientation (you can figure out which ones), then you now have eight identical coins with indistinguishable faces.
If you toss those coins once each, then there are 256 different ways the coins can land, and none of those different patterns is more likely than any other pattern.
Of course you can’t see the pattern, because you can’t distinguish the coins from each other or the two faces. But Mother Gaia knows what the pattern is, and if she would speak up she will tell you there are 256 different but equally likely patterns.
Now if instead of diamond wafers, we cut these coins from wafers of Gallium Arsenide (GaAs) , then now we can distinguish the heads and tails. There is a Gallium surface, and there is an Arsenic surface.
There still are 256 different patterns all equally likely, and the average Joe can't tell which is which; but Mother Gaia knows. Well, we can add the serial numbers to see the pattern ourselves.
So you see, the case of all heads or all tails or hthththt, or thththth, is no more unlikely than any other result.
It is the observer who arbitrarily chooses to ignore the fact that each coin is unique.
So 8 heads isn’t any more unlikely than 5 heads, or 3 heads.
So why don’t we ignore the Gallium and Arsenic faces, and then claim that we never ever get a different result when we toss eight coins; they always land the same way.
Statisticians like to claim all sorts of erroneous things about numerical configurations, that simply aren’t true.
The infamous case of the first Selective Service draft lottery, with 366! possible outcomes, where some statisticians claimed it was not random, is easily resolved by substituting 366 different icon pictures for the numbers 1-366.
Immediately, ALL patterns disappear, and no selection order, is more unlikely than any other including pulling the dates in calendar order.
g

Evan Jones
Editor
Reply to  RCS
September 18, 2015 5:48 pm

Yes, always use real data.
Sometimes it must be adjusted. It’s true. (Mosh is right about that, though my method is different.) But it must be a demonstrable adjustment, fully explained, with the raw data it was derived from archived. All data. All methods.

richard verney
Reply to  Evan Jones
September 18, 2015 11:13 pm

Willis
Whilst one can be reasonably certain that the 160deg entry is erroneous, and whilst one can understand why you consider that the entry ought to have read 16deg, no one can be certain that that is indeed the case.
For example, perhaps the true data was 18 deg but the 8 was carelessly written at an angle and was not a continuous stroke of the pen, such that the transcriber thought that the 8 looked more like a 6 and a 0 (i.e., 60), thereby transcribing 18 as 160.
When there is an obvious error in the data, it may be better to simply ignore that one entry. To seek to 'correct' it may impose a different 'error'.
If the data stream is too riddled with errors, perhaps the better option is to accept that the data is not fit for purpose rather than seeking to 'correct' almost every entry recorded. If only it were as simple as having a control whereby one knows and has measured that the new thermometer is reading 2 deg C warmer than the old. However, that is not the position in the temp data sets, where there are numerous (and endless) adjustments and readjustments to the past, each of different extent.
Whilst you are correct that the data sets are thin on the ground, it would be preferable to conduct an audit of the quality of the data from each weather station and ascertain those that are best sited, those with the most reliable (consistent) data, and those that have the longest record of consistent data, and work with the few.
I would suggest that it is better to work with a few good quality data sets rather than lots and lots of rubbish data. Whilst everyone knows that there is no such thing as GLOBAL warming (climate is regional and so is the response), IF GLOBAL warming is truly global, why would one need more than say a few hundred well spaced, good quality temperature data sets? It makes no sense to include a load of rubbish just to bring up numbers.
In fact just look how sparsely sampled the globe is, particularly in central Africa, the central belt of South America, central Australia, the great plains of Russia, Northern Canada, Alaska, the Poles etc.
The problem with the thermometer record is the lack of a quality audit of the stations themselves, and thence a proper quality audit of the raw data from the best sited stations. Climate scientists should have worked with the cream, not the crud.

Reply to  Evan Jones
September 19, 2015 10:10 am

Rv
It is not just what data and how to correct it; it is also who corrects it. For the same reason Willis brings up coin flips: people invariably see and create patterns in things that don't have them. You can tell a random set of numbers generated by humans from those generated by a computer, because humans try to make things look like there isn't a pattern. Humans would rarely if ever generate a number like 5555; they would generate 54732 or something that doesn't look like it has a pattern. People will subconsciously see patterns that conform to their nature and ignore patterns that don't. If you are inclined to believe the earth is warming, then when you examine data that supports this view, inconsistencies to the warm side will make sense and you will tend not to correct them; however, the opposite will be true of inconsistencies toward the cooling side. This would not be an issue if the people doing the work were of random persuasions, but they are not, and thus you would expect to see corrections that are one-sided, and I believe we do.

Reply to  Evan Jones
September 19, 2015 12:38 pm

“Whilst one can be reasonable certain that the 160deg entry is erroneous, and whilst one can understand why you consider that the entry ought to have read 16deg, no one can be certain that that is indeed the case.”
No one can be certain that any observer ever wrote down any number accurately.
No one can be certain that calibrated instruments don't go out of calibration when you are not looking and back into calibration when you check their calibration.
Measurement is not about certainty. Logic and math (2 + 2 = 4) is about certainty.
Here is a sweet example.
In a UHI study I did I used raw data. I forgot to flip the QC switch so I even used bad data.
One data point was 15000C.
I was happy because the study showed a UHI effect … when I turned QC on … oops … the effect got cut in half.
When you deal with historical data there are choices you have to make. Picking only the "good" data is still a choice, since there is no clear definition of what "good" data is.
The best you can do is make your choices, document your choices, test other choices, and recommend better approaches as folks move forward.
I've yet to see a skeptic actually try to implement and defend any approach, except perhaps jeffid and romanM. And they found that the earth has warmed.

September 18, 2015 4:14 pm

Out of the ballpark Willis.
I look forward to seeing more “is it noise?” analysis. It’s currently my favorite topic.
Noise can also exhibit trends. You should run with that and see if the trend in global temperature records is significant or not – is the trend just random variation of the long term temperature?
Here’s a paper that talks about the general topic of using noise to determine if a signal component is significant or not:
https://www.dropbox.com/s/lw1kzdfjw0ifcdo/10.1.1.28.1738.pdf?dl=0
(from http://paos.colorado.edu/research/wavelets/)
Also this book on long-term dependencies talks a bit about the topic and has lots of fun references I’m following up on:
https://books.google.es/books?id=jdzDYWtfPC0C&redir_esc=y
(paper: http://wwwf.imperial.ac.uk/~ejm/M3S8/Problems/beran92.pdf)
I’m still trying to find the original papers on the general methods of using appropriately shaped noise in Monte Carlo simulations to see if a signal has significance. Let me know what you have read please.
Best regards,
Peter

September 18, 2015 4:29 pm

In the presence of Lorenz-type transients, the effect of systematic environmental changes on present-day climate (changes, for example, involving secular increases of CO2 or other consequences of human activities) might be so badly confounded as to be totally unrecognisable.
From J. Murray Mitchell's introductory presentation to the Climatic Change Workshop at the SCEP conference, 1970. (He was using Lorenz's non-linearity findings to call into question the presumption that climatic equilibrium was a 'slave' to external forcing. Lamb was most impressed by this short paper and it dominated his review of SCEP in Nature.)

Dawtgtomis
September 18, 2015 4:33 pm

” I am saying that the usual terminology style of the modelers makes little distinction between real and modeled elements, with both often being called by the same name, and this mis-labeling does not further communication.”
That is the secret of their modus operandi, the vague terminology allows believers a conflation of virtual and factual reality to create the illusion of impending climate disaster.

September 18, 2015 4:55 pm

Also, higher pressures result in higher temperatures, not lower ones.

Gary Pearse
Reply to  wickedwenchfan
September 18, 2015 9:06 pm

A stationary warm spot causes air to rise -low pressure.

Victor Vector
September 18, 2015 5:24 pm

Hi Willis,
In regards to the Bonferroni Correction, it applies to testing multiple hypotheses against one trial. That is, asking many questions of the data for ONE trial. It is obvious that the more questions you ask about your data sample, the more likely you are to get a statistically significant answer to one of those questions (a bit like the birthday problem) even when it is in error. That is why the significance probabilities are adjusted in an attempt to take this situation into account.
If this is what you were implying then I found you were not clear enough; you seem to be saying that looking for one particular result in many different subsets would result in the answer you were seeking, which is something different.
Performing more trials can only increase your confidence in the analysis if the results of each trial are statistically valid. And (thinking of coins) one of those results may be 8 heads. But the overall probability in the sequence of trials of 8 heads appearing should represent the actual probability of that event occurring, and you should not present the results of this one trial as proof of your hypothesis.
You seem to be saying that they use multiple views on the data to obtain the result they are looking for (I have not checked to see if this is the case).
I am new at statistics and like you I am quite prepared to be wrong (life is a learning experience), but as I have read this, this is how I see it.
Warm Regards,
Mark.

Reply to  Victor Vector
September 19, 2015 12:48 pm

Mark, Willis is correct.
He is one of the few folks who get this dilemma.
I realized it when I was looking for GCR signals in cloud data.
The cloud data was years of daily cloud data, in geographic bins,
and at different pressure bands (like 12).
So think about that.
Here is what the theory says:
1. When GCR increase there is a possibility of increased formation of CCN and thus of clouds.
But it's not that simple. What if there are already 100% clouds in an area when the GCR increase?
What if there are already enough CCN? You see, the process has a limit …
So now you start to look for increasing clouds … where do you look? You've got thousands of cells around the globe and 12 different pressure levels.
So you look globally at high clouds … no joy, lower clouds, no joy, lowest clouds … no joy.
Then you think … maybe only high latitude clouds … so you bin by latitude and test again.
No joy.
Then you think … ok, I have to only look at cells which were first clear sky and then become cloudy.
Haha … like clouds don't move …
Suppose in the end you find clouds over Holland increase.
What have you shown?
Well, when you start with a VAGUE theory and no specific prediction, you end up hunting through data.
You are doing EDA, NOT hypothesis testing.
Once you finish your EDA … then you are in a position to wait for more data and see if your findings hold.

johann wundersamer
Reply to  Steven Mosher
September 19, 2015 8:15 pm

Suppose in the end you find clouds over holland increase.
What have you shown?
in the netherland mountains. mountains.
wholly go lightly.

jorgekafkazar
September 18, 2015 5:33 pm

Noisy signal + Filter + Wiggle-matching = spurious results.
I, too, noted the usual wiggle-matching exercise in the original post, so didn’t comment. One variable aligns with the other perfectly… except where it doesn’t. And I noticed where you snuck the actual DJF NAOI into Figure 2b. Clever dog! Nice job all around, in fact.

Curious George
September 18, 2015 6:04 pm

Good job, Willis, thank you. I wonder why Nature’s peer review did not notice that when you run a noise through a 9-13 year bandpass filter, you find a signal. Or try a 10-12 year bandpass, or a 10.5-11.5 year bandpass. A signal is always there, what a surprise – and a level of mathematical sophistication in the climatologic community is such they consider it worth publishing. (Needless to say, the signal will be there even with a 15-17 year bandpass, but then the Sun would no longer be an obvious culprit.)

emsnews
September 18, 2015 6:12 pm

They use ‘model earth’ because they hate reality. It is obvious they are aiming at pleasing powerful people who want to cry ‘wolf’ or ‘fire’ in a crowded theater so they can fleece the sheep. Nothing is allowed to stand in their way so ‘researchers’ who cry wolf and fire are richly rewarded so we get an army of howling lunatics yelling we are going to roast to death unless we tax CO2 exhalations.

Pamela Gray
September 18, 2015 6:15 pm

As usual, you are far more adept at presenting the case than I will ever hope to be. Especially since mothballing my old mac and the Statview software I loved so much. These days, long days in the classroom, and many weekends doing the same, leaves my evenings reserved for putting my feet up in an easy chair. Besides, laying out pearls before swine will not result in the swine being anything other than the pig it was before wisdom was offered.

Reply to  Pamela Gray
September 18, 2015 6:36 pm

No need for cynicism dear, Willis’ essay validates my notion that science can be explained to anyone so long as the person doing the explaining is aiming at the truth!

September 18, 2015 6:54 pm

Running models with innovative inputs and publishing the results has become a cottage industry but the findings are not converging into a unified synthesis of a theory of global warming. Thank you for this report. Really appreciate the high quality of the Eschenbach posts.

September 18, 2015 7:26 pm

“the authors haven’t heard of Bonferroni and his correction”
The so called Bonferroni correction was proposed in a paper by Holm in 1979.
I am not sure why it is called the Bonferroni correction or whether there was someone named Bonferroni who had something to do with Holm’s paper. [Holm,S. (1979). A simple sequentially rejective multiple test procedure, Scandinavian Journal of Statistics,6:2:65-70.]
It is an important issue and one often ignored in climate science and by the IPCC
The IPCC uses an alpha value of 0.1 per comparison.
At that rate if you make say 5 comparisons the probability of finding at least one false “effect” in random numbers is 1-(1-0.1)^5 = 41%.
This is why climate science is littered with spurious findings.
Also their use of alpha values of 0.05 and 0.10 is inconsistent with “Revised standards for statistical evidence” published by the NAS in which they propose an alpha of 0.001 to improve reproducibility of results. Here is the link: http://www.pnas.org/content/110/48/19313.abstract

September 18, 2015 8:15 pm

from the paper: Given the quasi-oscillatory behaviour of the solar cycle and that 4% of the total NAO variance can be explained by the solar variability in our SOL experiment, we believe that this mechanism could potentially improve decadal predictions. Although this contribution is relatively small regarding the NAO total variance, it represents a significant increment to other sources of predictable decadal variability [35].

from you: However, it does mean that if such a causal relationship exists, it is likely to be extremely weak.
Thank you for your essay.

Reply to  Willis Eschenbach
September 18, 2015 9:02 pm

They claim that 4% of the ModelEarth NAO variance is due to an imaginary sun. I don’t doubt that that is possible. Imaginary suns are known to have strange powers.
I was wondering how you would word your response. I didn’t want to write a leading question. No, really! That’s a good response. I have to reread the paper, but like you I did not find a direct model to data comparison. Their 4% figure comes across as a sort of extremely high upper bound.
It’s kind of like: No matter how you torture the data, they’ll only confess to 4%, and it isn’t credible.
Somebody invented the phrase “metaphysical anguish”, to denote the case where a scientist or philosopher has done much work and come up with little. iirc, it was Lovejoy in “The Great Chain of Being”. Nowadays we call it cognitive dissonance. After all that work, the authors just can’t not publish; and besides, they have to publish or perish. Thanks again for your work.

jorgekafkazar
Reply to  Willis Eschenbach
September 19, 2015 4:42 pm

True, we need to keep in mind that no results ARE a result. Just because your hypothesis is rejected doesn’t mean you haven’t accomplished anything. The day academics equated falsifying a hypothesis with lack of advancement was the day Science started to decay.

Scott
September 18, 2015 8:16 pm

Willis,
Don't these people have others in their field take a look at the premise, data, suppositions and conclusions – even BEFORE they would submit to peer review?
You often poke giant holes in researchers' papers – many glaring – yet these "scientists" seem to miss the obvious.
Has science come to this?…….

Reply to  Scott
September 18, 2015 9:19 pm

The giant holes are almost everywhere you look. Here is the story of how Kerry Emanuel, a climate scientist of great repute, tried to find a rising trend in hurricane activity. First he tried the ACE index (wind speed squared) but no trend was found. So he arbitrarily cubed the wind speed and called it the PDI (power dissipation index). But no trend could be found. So he took a moving average of the PDI. Still no trend. Then he took the moving average of the moving average and voila, there was the trend he was looking for. This paper was not only published but became a seminal paper in hurricane research. Thayer Watkins (SJSU) did a detailed analysis of the Emanuel paper Eschenbach-style. It would make a great WUWT post. Here is the link:

Jeff Alberts
Reply to  Chaam Jamal
September 18, 2015 10:31 pm

Wow. 1990 wants its web site back.

Billy Liar
Reply to  Chaam Jamal
September 19, 2015 12:56 pm

I’d never heard of PDI until a few days ago when reference to it appeared in a WUWT post. When I looked it up, I could not understand why it had been created when a metric, ACE, already existed.
Thanks for your explanation (and the link)! I have consigned PDI and Kerry Emanuel to the trash bin of my mind.

September 18, 2015 8:34 pm

Here is what they say about statistical significance: Statistical significance analysis. Given the high degree of serial correlation in the low-pass filtered time series, the significance of correlation between filtered NAO and F10.7 indices were assessed using a nonparametric random phase test [27]. This method preserves the spectrum and auto-correlation of the original data. In practice, we generate 1,000 synthetic random filtered NAO time series having the same power spectrum as the original one and we correlate each against the original F10.7 time series. The 1,000 correlation coefficients are used to construct a probability distribution of correlations. Regarding the composites, the significance level is estimated using a bootstrapping technique with replacement. The procedure is to select two random subsets from the original time series with the lengths equal to the two original composite subsamples. This procedure is repeated 1,000 times and a distribution of the differences is constructed. Finally, correlations and composite distributions are used to determine the likelihood of the derived signals arising by chance. One-tailed tests are used.

A one-tailed test with the alpha level set to 0.1 is not very impressive. Could you count how many hypotheses they were testing with their procedure? It looked to me like only 1 significance test (on the lag, or phase), but it followed a lot of judgments and choices. And that is following a history of publications of many models on these data. I have sort of come to a point where, unless there are true out of sample data, it’s just another modeling experiment for which any honest calculation of a “p-value” is close to impossible. It goes into the large collection of such models. It might be interesting to revisit it in 40 years. Sounds anti-intellectual, I know. Your effort was informative though, and I appreciate it.
I had not previously been aware that the paper was open to the public. Thank you again for the link.

jonesingforozone
September 18, 2015 8:50 pm

This 145 year study is limited by quality of proxies available.
For example, there is no reason why the F10.7 cm microwave index should track EUV irradiance, which has a much shorter wavelength. The justification for using it is that the proxy has correlated well for a few relatively short periods of observation in the distant past.
See Does the F10.7 index correctly describe solar EUV flux during the deep solar minimum of 2007–2009? and The ionosphere under extremely prolonged low solar activity.

Mike M. (period)
September 18, 2015 8:58 pm

Willis,
I do not think that you are being at all fair to Thiéblemont et al. Science often proceeds in small increments, and this would seem to be a case of that.
You wrote: "The authors' contention is that the sun acts to synchronize the timing of these swings to the timing of the solar fluctuations."
Where do they claim that? What I found (last sentence of abstract and introduction) was a much weaker statement: “The synchronization is consistent with the downward propagation of the solar signal from the stratosphere to the surface.”
You wrote: “Why not start by analyzing the real Earth before moving on to ModelEarth?”
That is exactly what they did. The very first sentence of the introduction: “There is increasing evidence that variations in solar irradiance at different time scales are an important source of regional climate variability”. That is followed by a brief discussion with a whole list of references.
You wrote: “If you use a cyclical forcing as input to a climate model, do not be surprised if you find evidence of that cycle in the model’s output … it is to be expected, but it doesn’t mean anything about the real world.”
Well the first sentence of the abstract is: “Quasi-decadal variability in solar irradiance has been suggested to exert a substantial effect on Earth’s regional climate.”
The key word here is SUBSTANTIAL. The issue is not whether there is some effect (of course there must be some signal, however tiny), it is whether it might be large enough to matter and whether the claimed lag is reasonable. Those are not questions that can be answered without investigation.
You wrote: “The cross-correlation of a century’s worth of data shows that relationship between the sunspots and the DJF NAOI is not statistically significant at any lag, and it does not indicate any causal relationship between solar fluctuations and the North Atlantic Oscillation”
The paper cites a dozen or so peer-reviewed papers claiming the opposite. I have no idea if they are right or if you are right. But if you want to convince scientists that published results are wrong, you actually have to address the analyses that were published.
I am not saying there is nothing wrong with the paper. Given its extremely weak conclusion, they do seem to be over-hyping it. There is a lot of that going around, in all fields of science. But viewed as an investigation into the question "is this idea even plausible" it may well have some value. They never seem to actually come out and say that is what they are doing; that is another common failing of scientific papers, related to the over-hyping. And they never really make it clear that it is plausible; probably yet another symptom of the over-hyping.
But your criticism is off base and unfair.

Mike M. (period)
Reply to  Willis Eschenbach
September 19, 2015 3:11 pm

Willis,
“Mike, you are starting out very poorly when you say they didn’t make a claim that the sun acts to synchronize the timing of the NAOI when it is in the damn title”
It does not seem to be in the abstract or conclusions; that tells me that the title is click bait. So it is part of the authors’ excessive hype. That makes them guilty of bad behavior, not bad science. Such hype has become depressingly common in science (not just climate science). I should have noted this in my post, but I tend to just tune such misleading titles out.
“No, they did NOT analyze the real Earth at all. They merely referred to other people who had done so, then went off to play with their models.”
In the scientific literature, those are pretty much the same thing. If you include a repeat of prior analysis, you will be told to remove it, unless nobody notices.
“So the fact that they have cited other papers means nothing. ”
By that standard the scientific literature means nothing. Progress in research is incremental, so you have to rely on what has been published. That should not be done blindly, but it has to be done. The flood of poor research in the ever expanding literature is indeed a problem and the normal corrective mechanisms seem (to me at least) to be failing. But to just throw out the entire published literature is not a solution.
“You seem to misunderstand what science is. Science is generally an attempt to determine what is actually true, not a quest to determine the limits of plausibility.”
You are the one who misunderstands science. You can not really determine what is true, you can only determine what is false and narrow down the limits of what may be true. Assessing plausibility is part of the narrowing down process. It is a small step, but science is incremental.
“Plausible” means nothing to me, I’m interested in facts.”
I am very surprised to hear you say that. I have read many of your articles with great interest. I don’t think I have seen you establish any facts. I have seen some very interesting investigations into what might or might not be plausible.
Perhaps there is a semantics issue here. I have considerable expertise in the area of chemical kinetics. We teach students that in comparing a proposed mechanism to experimental data, there are two possible results: either the mechanism is proven wrong or the mechanism is plausible.

Justthinkin
September 18, 2015 9:16 pm

And what all this boils down to … it's the SUN, idjits.

Gary Pearse
September 18, 2015 9:18 pm

It seems to me that if the sun's activity were to cause the NAOI, it would also cause a synchronized twin in the Pacific.
Also from your comment re quality control changes: “.. any adjustments need to be documented and explained, with the data saved at all steps. I believe this is the case with the Berkeley Earth data.”
It brings up the question, now that the record has been rejiggered to end the dreaded "pause", what does this do to the likes of BEST? Will they make adjustments to jibe with T. Karl's swift tailoring and the new paper also showing that there is no pause? Has Steve Mosher taken this on?
http://wattsupwiththat.com/2015/09/17/the-latest-head-in-the-sand-excuse-from-climate-science-the-global-warming-pause-never-happened/

Mike
September 18, 2015 11:50 pm

Lots of valid points Willis but one fundamental misconception:

This kind of appearance and disappearance of apparent cycles, which is quite common in climate datasets, indicates that they do not represent a real persisting underlying cycle.

You are misunderstanding what you are seeing. The waxing and waning of the amplitude and the change of direction around 1945 is typical of what you get when you mix two close, purely harmonic functions. Try plotting a few examples and you will see what I mean.
This kind of pattern does not mean that cycles are not there; it is a result of them slipping in and out of phase due to their different periods. Sometimes they cancel, sometimes they add.
I think you are confusing this with the fact that climate data (e.g. temps) are sometimes in phase with solar, leading people to conclude prematurely there is a link, and then they close their eyes to periods when it is perfectly out of phase with solar or simply does not match.
Even this latter case could be that there is some other periodic driver (one could think of circa 9-year lunar effects) mixing with solar in the way I describe above.
As you rightly say, climate is a complex system. Simplistic dismissal can be as wrong as simplistic attribution. Be careful.

Mike
September 19, 2015 12:08 am

This means that they’ve looked in five places to get their results

This is an important and valid statistical point but I think you are wrong in saying the authors have looked and not reported negative results. This is a collective error of climatology. NAO is often analysed with only winter months probably because it “works better”. That justifies your point. I think it likely that the authors were just following this established idea, rather than having looked and not reported.
They should have looked if they want to do this. Your criticism has much wider application than just this paper. It is now an institutionalised error.


Mike
Reply to  Mike
September 19, 2015 12:15 am

What your ARMA tests show is that random data contain periodicities that can be picked out with a bandpass filter and will add together to produce this sort of pattern. The question is whether the pattern is synchronised with solar.
Their fig 2 is clearly drifting out of phase by the end of their simulated data. That makes the case for the "phase-locking" they suggest very tenuous.

Reply to  Mike
September 19, 2015 5:51 am

Mike: Even this latter case could be that there is some other periodic driver ( one could think of circa 9y lunar effects ) mixing with solar in the way I describe above.
Mike: This is a collective error of climatology. NAO is often analysed with only winter months probably because it “works better”.
These are the basic two reasons for what I called my "anti-intellectual" approach to trying to assess the "statistical significance" of results of new analyses of well-worked data sets. You purely and simply can not discern how much data selection has gone on before based on a probably adventitious co-occurrence spotted via graphical or some other analysis. And then you can not tell how often what started as a straightforward hypothesis got reformulated into multiple mathematical models until at last one of them produced an apparently statistically significant improvement in fit. The procedure has an associated p-value of approximately 1, since a dedicated team is likely eventually to find good model fits to (selected) data even if the time series being investigated are statistically independent. Without out-of-sample data for a true test of model fit, the p-value is essentially a measure of the dedication and skill of the team, not a measure of whether the associated model fit statistic could have an unusual value under the null hypothesis. Such dedication and skill deserve respect, but the results ought not be deemed reliable indicators of what's happening in the world.
You may be aware that studies have revealed a high proportion of non-reproducible results in medical, neurophysiological and psychological journals. They result from researchers being too casual in addressing problems such as these.

Mike M. (period)
Reply to  Mike
September 19, 2015 8:27 am

Mike wrote: “As you rightly say, climate is a complex system. Simplistic dismissal can be as wrong as simplistic attribution.”
Well said.

1sky1
Reply to  Mike M. (period)
September 19, 2015 4:58 pm

Dismissal of one misguided idea by appealing to another one is exactly what is being done here. Totally absent is any analytic comprehension of RANDOM signals with CONTINUOUS power spectra of various bandwidths, which manifest the waxing and waning of irregular wave-forms often seen in nature. Periodograms are simply not an adequate analysis tool for such geophysical signals.

1sky1
Reply to  Mike M. (period)
September 21, 2015 2:18 pm

That random signals have a different spectral structure than periodic signals, requiring different analysis techniques, is common knowledge among qualified signal analysts. Wholly oblivious to this, Willis proves incapable of uniquely identifying even a pair of pure sinusoids without leakage through the spectral window of the periodogram. If he spent half the time studying standard texts (e.g., Parzen, Bendat & Piersol, Koopmans, etc.) as he does carping on the lack of a rigidly formulaic approach in my critical comments, he would address the issues raised, instead of hiding his obvious failings behind a facade of patronizing ad hominems.

1sky1
Reply to  Mike M. (period)
September 23, 2015 2:12 pm

When someone is manifestly oblivious of the fact that the true periodogram of a pair of pure sinusoids consists of only two lines, not a dozen as shown in Willis' figure, and that the difference between random signals and periodic ones is the spectral density continuum and decaying acf, rather than line structure and periodic acf, it's apparent that he's the one who should worry about being taken "seriously" as a signal analyst. And when someone pretends that the Texas saying of "all hat and no cattle" hails from cowboys in Hawaii, it's apparent that someone's flinging cow-chips.
The whole idea that criticism is valid only if the entire problem is reworked by the critic is a hilarious evasion. Because of WUWT’s permissiveness in publishing quirky, personal conceits and totally inept conjectures, I’ll never allow my work to appear on these pages.

September 19, 2015 12:23 am

If applying a moving average filter to random data produces an oscillatory output, then the filter is ill designed for the application.
Any electronic engineer will tell you that what you are doing here is the equivalent of a low pass filter, which can have various values of 'Q'. Too much 'Q' will give you the illusion of a tone, where none exists in the input data.
This is yet another effect well known to engineers that is coming to light in other areas. The spatial filtering applied by the neural net software used for pattern recognition, which resulted in hallucinatory appearances of buildings and dogs' faces in clouds and other photographs, is yet another example of emphasising a pattern where no pattern exists, simply because you have looked too hard for it (turned up the gain).
http://googleresearch.blogspot.co.uk/2015/06/inceptionism-going-deeper-into-neural.html

Mike
Reply to  Willis Eschenbach
September 19, 2015 2:34 pm

What is the MA in ARMA ??
Putting aside the fact that MA is a very bad filter to choose, I found your comment about negative MA being common in climate data interesting. It probably reflects long-term negative feedbacks dominating the system: i.e. random input, the autoregressive integrating nature of the oceans, and persistent negative feedbacks constraining changes.

ren
September 19, 2015 12:41 am

F10.7 is perfectly legitimate, as it relates to changes in the UV and is not consistent with the number of spots.

ren
Reply to  Willis Eschenbach
September 20, 2015 2:25 am

The number of spots and F10.7 differ from each other.
http://www.solen.info/solar/images/solar.png

jonesingforozone
Reply to  ren
September 19, 2015 1:45 pm

For the last twenty years, EUV irradiance has diverged significantly from the F10.7 cm microwave index, the Mg II index, sunspots, and the solar quiet variation.
This effect is directly observed by increased space junk accumulation and less drag on satellites, for example.
