Guest Post by Willis Eschenbach
I’ve been investigating the use of the “complete ensemble empirical mode decomposition” (CEEMD) analysis method, which I discussed in a previous post entitled Noise-Assisted Data Analysis.
One of the big insights leading to modern signal analysis was the brilliant idea of Joseph Fourier. He realized that any given waveform can be expressed as a combination of sine and cosine waves. However, there are other ways besides Fourier’s method to decompose a signal, including periodicity analysis, principal component analysis, and CEEMD.
Let me give you an example of a CEEMD analysis. Here are the intrinsic modes for the annual average number of sunspots from 1700 to 2014. The top row is the sunspot data itself. For intercomparison with other signals, it is standardized to a mean of zero and a standard deviation of one.
Figure 1. CEEMD analysis of the mean annual sunspot numbers. Top panel shows the sunspot data, standardized to a mean of zero and a standard deviation of one. Panels marked C1 – C7 show the intrinsic modes of the signal. The bottom line shows the residual, meaning what remains after the removal of modes C1 – C7 from the signal. Note that all intrinsic modes are displayed at their true size, with all scales being the same.
This is a “complete decomposition” of the raw data signal, meaning that if we add the intrinsic modes C1 – C7 together along with the residual, the sum will exactly reconstruct the original signal.
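To make the completeness property concrete, here is a minimal Python sketch. It does not perform CEEMD. Instead it uses a toy stand-in decomposition (successive moving averages, with arbitrarily chosen window lengths) on a made-up signal, purely to show that when each component is defined as what the previous step removed, the components plus the residual add back to the original. The function names, the windows, and the synthetic signal are all assumptions for illustration, not part of the analysis above.

```python
import math

def moving_average(values, window):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def toy_decompose(signal, windows=(3, 9, 27)):
    """Split a signal into fast-to-slow components plus a residual.

    This is NOT CEEMD: each component is simply whatever a wider moving
    average removes. It only illustrates the completeness property.
    """
    modes = []
    current = list(signal)
    for window in windows:
        smooth = moving_average(current, window)
        modes.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    return modes, current  # `current` is the residual

# Synthetic signal (hypothetical): an 11-step cycle, a 90-step cycle, and a trend.
signal = [
    math.sin(2 * math.pi * t / 11) + 0.5 * math.sin(2 * math.pi * t / 90) + 0.01 * t
    for t in range(200)
]

modes, residual = toy_decompose(signal)

# Add the components back together, time step by time step.
rebuilt = [sum(parts) + r for parts, r in zip(zip(*modes), residual)]
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

In floating-point arithmetic the reconstruction is exact only up to rounding, so `max_error` should be tiny rather than literally zero.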
One thing I like a lot about the CEEMD analysis is that I can actually see how the underlying intrinsic modes vary over time. I’ve said before that ascribing an inherent cyclical mechanism to natural observations is fraught with problems. Figure 1 is a good example of these problems. Look at intrinsic mode C4. It has a small signal at about 22 years … but not all of the time. For most of the first century of the record there is little signal at all. Then there’s an intermittent small ~ 22-year signal from about 1780 to 1850 … which fades out and, after a hiatus of a few years, is replaced by a single ~ 25-year cycle, and that in turn is replaced with a ~ 22-year cycle out to the end of the data.
Or we can consider intrinsic mode C6, which varies in a similar irregular fashion. Mode C6 has a couple of strong cycles with a period of around 90 years at about 1800, and it then kind of tails off to nothingness. This makes it obvious why it has been so hard to settle the existence or non-existence of the so-called “Gleissberg Cycle”, which Gleissberg claimed was ~ 80 – 100 years. When cycles come and go like that, it is hard to draw any firm conclusions. I mean, the ~ 80 – 100 year cycle is definitely there … but it’s only there when it is there, and the rest of the time, well, it’s simply not there.
Unfortunately, these kinds of appearing and disappearing cycles are far too common in natural datasets. There is a great temptation to think that they can be used for forecasting purposes … and they could, if it weren’t for Murphy’s Law, which says that as soon as you start prophesying, the cycle will die out. For example, if we looked at the sunspot data in the year 1900, we’d think that there was a strong, statistically significant hundred-year cycle in the data … but after 1900 the cycle simply fades out to nothing.
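One simple way to put a number on a cycle that comes and goes is to measure the amplitude of a mode in separate stretches of the record. The Python sketch below does this with the root-mean-square (RMS) amplitude of consecutive segments. The mode it uses is a hypothetical decaying 90-year cycle, not an actual CEEMD output, and the function names and segment length are assumptions chosen for illustration.

```python
import math

def rms(values):
    """Root-mean-square amplitude of a series."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def strength_by_segment(mode, segment_length):
    """RMS amplitude of a mode in consecutive, non-overlapping segments."""
    last_start = len(mode) - segment_length
    return [
        rms(mode[start:start + segment_length])
        for start in range(0, last_start + 1, segment_length)
    ]

# Hypothetical mode: a 90-year cycle whose amplitude decays over 315 "years".
mode = [
    math.exp(-t / 100) * math.sin(2 * math.pi * t / 90)
    for t in range(315)
]

# Three segments of 105 years each: early, middle, and late record.
strengths = strength_by_segment(mode, 105)
```

For this made-up mode the three values fall steadily from the first segment to the last. A forecast fitted to the early record alone would assume a much stronger cycle than the later record contains.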
Having seen the actual waveforms of the intrinsic modes in Figure 1, we can look at the periodograms of the various intrinsic modes to see what kind of signals exist in each of the intrinsic modes C1 to C7.
Figure 2. Periodograms of each of the intrinsic modes C1 through C7 of the annual mean sunspot number, 1700-2014. These show the strength of waves of the various periods in each of the intrinsic modes.
Figure 2 shows that most of the energy is in the ~11 year cycle, which is in intrinsic mode C3. Because the sunspot cycle varies between ten and thirteen years, the energy is not a sharp spike but is spread across that range.
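For readers who want to see what a periodogram measures, here is a minimal Python sketch. It computes the power at each trial period by projecting the series onto a sine and a cosine of that period. This is one straightforward way to build a periodogram and is not necessarily how the figures above were produced. The series is a synthetic, perfectly regular 11-year cycle, not the sunspot record, and the function name is hypothetical.

```python
import math

def periodogram(values, periods):
    """Return {period: power} for each trial period (in samples)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    power = {}
    for period in periods:
        # Project the series onto a cosine and a sine of this period.
        c = sum(v * math.cos(2 * math.pi * t / period) for t, v in enumerate(centered))
        s = sum(v * math.sin(2 * math.pi * t / period) for t, v in enumerate(centered))
        power[period] = (c * c + s * s) / n
    return power

# Synthetic stand-in: 315 "years" of a perfectly regular 11-year cycle.
series = [math.sin(2 * math.pi * t / 11) for t in range(315)]

spectrum = periodogram(series, range(2, 101))
peak_period = max(spectrum, key=spectrum.get)
```

With a perfectly regular cycle the power is concentrated at a single period. A real cycle whose length wanders from one cycle to the next spreads that power over neighboring periods.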
As discussed above, intrinsic mode C4 can be seen to have a very small bit of energy in the 22 year range, but as we saw in Figure 1, there’s nothing regular enough in the data to give a strong signal.
Again as discussed above, intrinsic mode C6 seems to have some energy in the 90-100 year range … but as Figure 1 shows, the ~100 year signal in C6, while strong, is mostly visible in the first half of the data. This greatly increases the odds that it is a spurious signal that could disappear in a longer record.
So that is an example of a CEEMD analysis of a signal. It gives us a picture of the intrinsic modes (Figure 1) and the periodograms of those same intrinsic modes (Figure 2). It shows the strength and the ebb and flow of the underlying cycles in the data.
Now, how else can this kind of analysis be useful? Well, it can show whether and how two distinct observational datasets might be related. As an example, here is the CEEMD analysis of both the Nino3.4 Index and the Southern Oscillation Index (SOI). The Nino3.4 Index is a detrended sea surface temperature dataset for an area in the tropical central Pacific covering 5° North to 5° South and 120° West to 170° West. The SOI, on the other hand, is an index of the atmospheric pressure difference between Tahiti and Darwin, Australia. The two indexes seem to be measures of the El Nino/La Nina pumping action. The SOI and the Nino3.4 move in opposite directions, so the SOI is usually displayed inverted so that peaks in the SOI correspond with peaks in temperature. The next two figures show the CEEMD analysis of the two datasets:
Figures 3 and 4. Upper figure shows the intrinsic modes resulting from the CEEMD analysis of the Southern Oscillation Index (SOI, red) and the Nino3.4 Index (black). Lower figure shows the periodograms of those intrinsic modes. These two datasets cover the period 1870 – 2011, shorter than the 1700 – 2014 span of the sunspot data shown in Figure 1.
Again, we see that there are various cycles which are strong in part of the record, but disappear or are greatly diminished in other parts of the record.
Note the close correspondence of the decomposition of the two signals, both in terms of the strength and shape of the intrinsic modes, and in their periodograms. It is clear that although the Nino3.4 Index and the Southern Oscillation Index measure different variables, they are both measures of the same phenomenon.
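The visual correspondence between two sets of modes can also be checked numerically, for example by correlating each mode of one dataset with the matching mode of the other. The Python sketch below does this with the Pearson correlation, as a hedged illustration only. The "modes" are synthetic stand-ins (a temperature-like index, and a pressure-like index that moves the opposite way), not the actual Nino3.4 or SOI decompositions, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def compare_modes(modes_a, modes_b):
    """Correlate corresponding modes: first with first, second with second, and so on."""
    return [pearson(a, b) for a, b in zip(modes_a, modes_b)]

# Hypothetical stand-ins for two modes of a temperature-like index.
years = range(142)
temp_modes = [
    [math.sin(2 * math.pi * t / 3.7) for t in years],
    [math.sin(2 * math.pi * t / 5.5) for t in years],
]

# A pressure-like index that moves the opposite way, plus a small unrelated wiggle.
pressure_modes = [
    [-0.8 * v + 0.1 * math.cos(2 * math.pi * t / 2.3) for t, v in zip(years, mode)]
    for mode in temp_modes
]

# Correlations before and after inverting the pressure-like index.
raw = compare_modes(temp_modes, pressure_modes)
inverted = compare_modes(temp_modes, [[-v for v in mode] for mode in pressure_modes])
```

Before inversion the correlations are strongly negative, and after inversion they are strongly positive, which is why an index that moves opposite to temperature is flipped before the two are compared.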
And it is equally clear that there is no significant sunspot signal in either the SOI or the Nino3.4 data: the CEEMD analysis shows little commonality. Unlike the sunspot data, in the SOI and Nino3.4 data there is little strength at 11 years, and little strength at around 90 years. And again unlike the sunspot data, the majority of the energy is in the short-cycle (2-6 year) part of the spectrum.
Anyhow, that’s why I’ve grown fond of the CEEMD analysis … it shows when datasets have related cycles, and when they are unrelated.
Pushing towards full moon tonight, with Jupiter and Arcturus vying for the moon’s attention … what a world …
My best to everyone,
My Usual Request: Misunderstandings bring communication to a halt, so if you disagree with me or anyone, please quote the exact words you disagree with so we can all understand your objections. I can defend my own words. I cannot defend someone else’s interpretation of some unidentified words of mine.
My Other Request: If you believe that e.g. I’m using the wrong method or the wrong dataset, please educate me and others by demonstrating the proper use of the right method or the right dataset. Simply claiming I’m doing something wrong doesn’t advance the discussion unless you can tell us how to do it right.
Yearly Sunspot Data: SILSO
SOI Data: Here
Nino3.4 Data: NOAA
Code: I’m using the “CEEMD” function in the R package “hht” for the analysis.