Guest Post by Willis Eschenbach
While investigating the question of cycles in climate datasets (Part 1, Part 2), I invented a method I called “sinusoidal periodicity”. What I did was to fit a sine wave of various periods to the data, and record the amplitude of the best fit. I figured it had been invented before, so I asked people what I was doing and what its name was. I also asked if there was a faster way to do it, as my method does a lot of optimization (fitting) and thus is slow. An alert reader, Michael Gordon, pointed out that I was doing a type of Fast Fourier Transform (FFT) … and provided a link to Nick Stokes’ R code to verify that indeed, my results are identical to the periodogram of his Fast Fourier Transform. So it turns out that what I’ve invented can best be described as the “Slow Fourier Transform”, since it does exactly what the FFT does, only much slower … which sounds like bad news.
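To make the idea concrete, here is a minimal R sketch of that kind of direct sine fitting (illustrative only: the function name and toy data are made up, not taken from the code in the folder linked at the end of the post). At any one trial period the fit is linear in a sine and a cosine component, so lm() can do it in a single step instead of by iterative optimization.

```r
# Minimal sketch of a "slow Fourier transform" periodogram: for each trial
# period, least-squares fit a sine wave and record its peak-to-peak amplitude.
sft_periodogram <- function(t, y, periods) {
  sapply(periods, function(P) {
    fit <- lm(y ~ sin(2 * pi * t / P) + cos(2 * pi * t / P))
    2 * sqrt(sum(coef(fit)[2:3]^2))   # peak-to-peak amplitude, in the units of y
  })
}

# toy monthly series: a 22-year cycle buried in noise
set.seed(1)
t <- seq(1850, 2010, by = 1 / 12)
y <- 0.2 * sin(2 * pi * t / 22) + rnorm(length(t), sd = 0.5)
periods <- seq(2, 80, by = 0.5)
plot(periods, sft_periodogram(t, y, periods), type = "l",
     xlab = "cycle length (years)", ylab = "peak-to-peak amplitude")
```

The phase, if you want it, is just the atan2 of the cosine and sine coefficients; expanding the sine into those two components is the same point Nick Stokes makes in the comments below.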
My great thanks to Michael, however, because actually I’m stoked to find out that I’m doing a Fourier transform. First, I greatly enjoy coming up with new ideas on my own and then finding out people have thought of them before me. Some folks might see that as a loss, finding out that someone thought of my invention or innovation before I did. But to me, that just means that my self-education is on the right track, and I’m coming up with valuable stuff. And in this case it also means that my results are a recognized quantity, a periodogram of the data. This is good news because people already understand what it is I’m showing.
Figure 1. Slow Fourier transform periodograms of four long-term surface air temperature datasets. Values are the peak-to-peak amplitude of the best-fit sine wave at each cycle length. The longest period shown in each panel is half the full length of the dataset. Top panel is Armagh Observatory in Ireland. The second panel is the Central England Temperature (CET), which is an average of three stations in central England. Third panel is the Berkeley Earth global temperature dataset. The fourth panel shows the HadCRUT4 global temperature dataset. Note that the units are in degrees C, and represent the peak-to-peak swings in temperature at each given cycle length. Data in color are significant after adjustment for autocorrelation at the 90% level. Significance is calculated after removing the monthly seasonal average variations.
I’m also overjoyed that my method gives identical results to its much speedier cousin, the Fast Fourier transform (FFT), because the Slow Fourier Transform (SFT) has a number of very significant advantages over the FFT. These advantages are particularly important in climate science.
The first big advantage is that the SFT is insensitive to gaps in the data. For example, the Brest tide data goes back to 1807, but there are some missing sections, e.g. from 1836-1846 and 1857-1860. As far as I know, the FFT cannot analyze the full length of the Brest data in one block, but that makes no difference to the SFT. It can utilize all of the data. As you can imagine, in climate science this is a very common issue, so this will allow people to greatly extend the usage of the Fourier transform.
The second big advantage is that the SFT can be used on an irregularly spaced time series. The FFT requires data that is uniformly spaced in time. But there’s a lot of valuable irregularly spaced climate data out there. The slow Fourier transform allows us to calculate the periodogram of the cycles in that irregular data, regardless of the timing of the observations. Even if all you have are observations scattered at various times throughout the year, with entire years missing and some years only having two observations while other years have two hundred observations … no matter. All that affects is the error of the results; it doesn’t prevent the calculation as it does with the FFT.
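As an illustration of that point (made-up data, not a real record): nothing in the sine fit assumes evenly spaced times, so observations scattered at random with a multi-year hole in the middle are handled without any interpolation or gridding.

```r
# Hypothetical irregularly sampled series with a twelve-year gap.
fit_p2p <- function(t, y, P) {
  fit <- lm(y ~ sin(2 * pi * t / P) + cos(2 * pi * t / P))
  2 * sqrt(sum(coef(fit)[2:3]^2))
}

set.seed(1)
t <- sort(runif(400, 1850, 2010))   # observation times scattered at random
t <- t[t < 1900 | t > 1912]         # a twelve-year hole in the record
y <- 0.3 * sin(2 * pi * t / 60) + rnorm(length(t), sd = 0.4)

periods <- seq(5, 80, by = 1)
amp <- sapply(periods, function(P) fit_p2p(t, y, P))
periods[which.max(amp)]             # should land near the true 60-year period
```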
The third advantage is that the slow Fourier transform is explainable in layman’s terms. If you tell folks that you are transforming data from the time domain to the frequency domain, people’s eyes glaze over. But everyone understands the idea of e.g. a slow six-inch (150 mm) decade-long swing in the sea level, and that is what I am measuring directly and experimentally. Which leads me to …
… the fourth advantage, which is that the results are in the same units as the data. This means that a slow Fourier transform of tidal data gives answers in mm, and an SFT of temperature data (as in Figure 1) gives answers in °C. This allows for an intuitive understanding of the meaning of the results.
The final and largest advantage, however, is that the SFT method allows the calculation of the actual statistical significance of the results for each individual cycle length. The SFT involves fitting a sine wave to some time data. Once the phase and amplitude are optimized (fit) to the best value, we can use a standard least squares linear model to determine the p-value of the relationship between that sine wave and the data. In other words, this is not a theoretical calculation of the significance of the result. It is the actual p-value of the actual sine wave vis-a-vis the actual data at that particular cycle length. As a result, it automatically adjusts for the fact that some of the data may be missing. Note that I have adjusted for autocorrelation using the method of Nychka. In Figure 1 above, results that are significant at the 90% threshold are shown in color. See the note at the end for further discussion regarding significance.
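As a rough sketch of that significance step (a reconstruction from the description above, not the script in the linked folder; in particular, the autocorrelation adjustment shown here uses the common n(1 - r1)/(1 + r1) effective-sample-size form, which may not be exactly what the actual code does):

```r
# Fit the sine at one trial period, regress the actual data on the actual
# fitted sine wave, then deflate the degrees of freedom using the lag-1
# autocorrelation of the residuals before reading off the p-value.
sine_significance <- function(t, y, P) {
  si  <- sin(2 * pi * t / P)
  co  <- cos(2 * pi * t / P)
  fit <- lm(y ~ si + co)
  wave <- fitted(fit)                            # the best-fit sine wave itself
  m    <- lm(y ~ wave)                           # data versus that sine wave
  r1   <- acf(residuals(m), plot = FALSE)$acf[2] # lag-1 autocorrelation
  n    <- length(y)
  n_eff <- n * (1 - r1) / (1 + r1)               # effective sample size
  t_adj <- summary(m)$coefficients["wave", "t value"] * sqrt(n_eff / n)
  p_adj <- 2 * pt(-abs(t_adj), df = max(n_eff - 2, 1))
  c(p2p = 2 * sqrt(sum(coef(fit)[2:3]^2)), p = unname(p_adj))
}
```

A cycle length would then be drawn in color when its adjusted p-value comes in under 0.10, matching the 90% threshold used in Figure 1.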
Finally, before moving on, let me emphasize that I doubt if I’m the first person to come up with this method. All I claim is that I came up with it independently. If anyone knows of an earlier reference to the technique, please let me know.
So with that as prologue, let’s take a look at Figure 1, which I repeat here for ease of reference.
There are some interesting things and curious oddities about these results. First, note that we have three spatial scales involved. Armagh is a single station. The CET is a three-station average taken to be representative of the country. And the Berkeley Earth and HadCRUT4 data are global averages. Despite that, however, the cyclical swings in all four cases are on the order of 0.3 to 0.4°C … I’m pretty sure I don’t understand why that might be. Although I must say, it does have a certain pleasing fractal quality to it. It’s curious, however, that the cycles in an individual station should have the same amplitude as the cycles in the global average data … but we have to follow the facts wherever they may lead us.
The next thing that I noticed about this graphic was the close correlation between the Armagh and the CET records. While these two areas are physically not all that far apart, they are on different islands, and one is a three-station average. Despite that, they both show peaks at 3, 7.8, 8.2, 11, 13, 14, 21, 24, 28, 34, and 42 years. The valleys between the peaks are also correlated. At about 50 years, however, they begin to diverge. Possibly this divergence is just random fluctuation, although the CET dropping to zero at 65 years would seem to rule that out.
I do note, however, that neither the Armagh nor the CET show the reputed 60-year period. In fact, none of the datasets show significant cycles at 60 years … go figure. Two of the four show peaks at 55 years … but both of them have larger peaks, one at 75 and one at 85 years. The other two (Armagh and HadCRUT) show nothing around 60 years.
If anything, this data would argue for something like an 80-year cycle. However … let’s not be hasty. There’s more to come.
Here’s the next oddity. As mentioned above, the Armagh and CET periodograms have neatly aligned peaks and valleys over much of their lengths. And the Berkeley Earth periodogram looks at first blush to be quite similar as well. But Figure 2 reveals the oddity:
Figure 2. As in Figure 1 (without significance information). Black lines connect the peaks and valleys of the Berkeley Earth and CET periodograms. As above, the length of each periodogram is half the length of the dataset.
The peaks and valleys of CET and Armagh line up one right above the other. But that’s not true about CET and Berkeley Earth. They fan out. Again, I’m pretty sure I don’t know why. It may be a subtle effect of the Berkeley Earth processing algorithm, I don’t know.
However, despite that, I’m quite impressed by the similarity between the station, local area, and global periodograms. The HadCRUT dataset is clearly the odd man out.
Next, I looked at the differences between the first and second halves of the individual datasets. Figure 3 shows that result for the Armagh dataset. As a well-documented single-station record, presumably this is the cleanest and most internally consistent dataset of the four.
Figure 3. The periodogram of the full Armagh dataset, as well as of the first and second halves of that dataset.
This is a perfect example of why I pay little attention to purported cycles in the climate datasets. In the first half of the Armagh data, which covers a hundred years, there are strong cycles centered on 23 and 38 years, and almost no power at 28 years.
In the second half of the data, both the strong cycles disappear, as does the lack of power at 28 years. They are replaced by a pair of much smaller peaks at 21 and 29 years, with a minimum at 35 years … go figure.
And remember, the 23- and 38-year periods persisted for about four and about three full cycles respectively in the 104-year half-datasets … they persisted for a hundred years, and then disappeared. How can one say anything about long-term cycles in a system like that?
Of course, having seen that odd result, I had to look at the same analysis for the CET data. Figure 4 shows those periodograms.
Figure 4. The periodogram of the full CET dataset, and the first and second halves of that dataset.
Again, this supports my contention that looking for regular cycles in climate data is a fool’s errand. Compare the first half of the CET data with the first half of the Armagh data. Both contain significant peaks at 23 and 38 years, with a pronounced v-shaped valley between.
Now look at the second half of each dataset. Each has four very small peaks, at 11, 13, 21, and 27 years, followed by a rising section to the end. The similarity in the cycles of both the full and half datasets from Armagh and the CET, which are two totally independent records, indicates that the cycles which are appearing and disappearing synchronously are real. They are not just random fluctuations in the aether. In that part of the planet, the green and lovely British Isles, in the 19th century there was a strong ~22 year cycle. A hundred years, that’s about five full periods at 22 years per cycle. You’d think after that amount of time you could depend on that … but nooo, in the next hundred years there’s no sign of the pesky 22-year period. It has sunk back into the depths of the fractal ocean without a trace …
One other two-hundred year dataset is shown in Figure 1. Here’s the same analysis using that data, from Berkeley Earth. I have trimmed it to the 1796-2002 common period of the CET and Armagh.
Figure 5. The SFT periodogram of the full Berkeley Earth dataset, and the first and second halves of that dataset.
Dang, would you look at that? That’s nothing but pretty. In the first half of the data, once again we see the same two peaks, this time at 24 and 36 years. And just like the CET, there is no sign of the 24-year peak in the second hundred years. It has vanished, just like in the individual datasets. In Figure 6 I summarize the first and second halves of the three datasets shown in Figs. 3-5, so you can see what I mean about the similarities in the timing of the peaks and valleys:
Figure 6. SFT Periodograms of the first and second halves of the three 208-year datasets. Top row is Berkeley Earth, middle row is CET, and bottom row is Armagh Observatory.
So this is even further confirmation both of the reality of the ~23-year cycle in the first half of the data … and of the reality of its total disappearance in the last half of the data. The similarity of these three datasets is a bit of a shock to me, as they range from an individual station to a global average.
So that’s the story of the SFT, the slow Fourier transform. The conclusion is not hard to draw. Don’t bother trying to capture temperature cycles in the wild, those jokers have been taking lessons from the Cheshire Cat. You can watch a strong cycle go up and down for a hundred years. Then just when you think you’ve caught it and corralled it and identified it, and you have it all caged and fenced about with numbers and causes and explanation, you turn your back for a few seconds, and when you turn round again, it has faded out completely, and some other cycle has taken its place.
Despite that, I do believe that this tool, the slow Fourier transform, should provide me with many hours of entertainment …
My best wishes to all,
w.
As Usual, Gotta Say It: Please, if you disagree with me (and yes, unbelievably, that has actually happened in the past), I ask you to have the courtesy to quote the exact words that you disagree with. It lets us all understand just what you think is wrong.
Statistical Significance: As I stated above, I used a 90% level of significance in coloring the significant data. This was for a simple reason. If I use a 95% significance threshold, almost none of the cycles are statistically significant. However, as the above graphs show, the agreement not only between the three independent datasets but between the individual halves of the datasets is strong evidence that we are dealing with real cycles … well, real disappearing cycles, but when they are present they are undoubtedly real. As a result, I reduced the significance threshold to 90% to indicate at least a relative level of statistical significance. Since I maintained that same threshold throughout, it allows us to make distinctions of relative significance based on a uniform metric.
Alternatively, you could argue for the higher 95% significance threshold, and say that this shows that there are almost no significant cycles in the temperature data … I’m easy with either one.
Data and Code: All the data and code used to do the analysis and make these graphics are in a 1.5 MB zipped folder called “Slow Fourier Transform“. If you set your R working directory to that folder it should all work. The file “sea level cycles.R” is the main file. It contains piles of code for this and the last two posts on tidal cycles. The section on temperature (this post) starts at about line 450. Some code on this planet is user-friendly. This code is user-aggressive. Things are not necessarily in order. It’s not designed to be run top to bottom. Persevere, I’ll answer questions.
Nick Stokes says:
May 4, 2014 at 12:36 am
Sorry, Nick, I didn’t see your name on the post so I didn’t credit you. I’ve changed the head post to reflect your contribution, much appreciated.
w.
Alex says:
May 4, 2014 at 7:33 am
Thanks, Capital A Alex, noted.
w.
But Willis, you are wandering into the frequency domain. You are calculating the amplitude and phase of various components in the signal and you are making an implicit assumption about the structure of the signal in the frequency domain. In this case, one has to understand the relationship between sampling statistics and computation of the Fourier Integral.
This is a fundamental property of the signal and it doesn’t matter how you try to estimate the frequency (or period) and phase, one is dealing with a signal that is corrupted by irregular sampling. For example, when you have established the Fourier coefficients, using these for interpolation will give errors.
The problem with non-periodic sampling is that it forces correlation between components that are orthogonal. If you express the DFT in matrix form, it has orthogonal eigenvectors. When the sampling is irregular, the eigenvectors are non-orthogonal. To put this another way, when one reconstructs (interpolates) a signal that is regularly sampled, one can treat the process as a convolution with sin(wt)/t. This has the property in regular sampling of being zero at each sample, so the original signal is unaffected, and the interpolated samples are a weighted sum of, in principle, the infinite sequence of samples in the signal. This cannot be done for an irregular signal as one cannot assign a frequency, w, to the interpolating function.
This is a bit of an arm-waving explanation, but the point I am making is that there exists an underlying theory relating irregular sampling and its spectrum.
The other question is why not perform the calculation using simple quadrature, i.e., direct integration of the Fourier integral?
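A quick numerical check of the orthogonality point above (a toy illustration with arbitrarily chosen harmonics):

```r
# Two DFT harmonics that are exactly orthogonal on a regular grid are no
# longer orthogonal when the same number of samples is scattered in time.
n     <- 120
t_reg <- 0:(n - 1)
set.seed(2)
t_irr <- sort(runif(n, 0, n - 1))

inner <- function(t, k1, k2)
  sum(sin(2 * pi * k1 * t / n) * sin(2 * pi * k2 * t / n))

inner(t_reg, 3, 7)   # essentially zero: the components are orthogonal
inner(t_irr, 3, 7)   # generally far from zero: the components are correlated
```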
Don’t bother trying to capture temperature cycles in the wild, those jokers have been taking lessons from the Cheshire Cat.
=========================
This shows how cycles can imitate the Cheshire Cat:
http://www.animations.physics.unsw.edu.au/jw/beats.htm
In performing a Fourier analysis to look for periodicities, like you are doing, it is best to first detrend the data. A trend can be viewed as either non-periodic or a periodic term that has a period longer than the duration of the data time series. Without removal of a trend, you will end up mapping a (possibly messy) continuum of Fourier power into the power of the periodic ingredients that might actually be present in the data. Also, I would call the discrete fitting of individual Fourier terms to a data time series a “periodogram”.
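In R that pre-step is a one-liner; a minimal sketch (assuming, as suggested, a simple linear trend, though the choice of trend is itself a modelling decision, a point taken up further down the thread):

```r
# Remove a fitted linear trend before computing the periodogram.
set.seed(3)
t <- seq(1850, 2010, by = 1 / 12)
y <- 0.005 * (t - 1850) + 0.2 * sin(2 * pi * t / 22) + rnorm(length(t), sd = 0.3)
y_detrended <- residuals(lm(y ~ t))   # this is what gets fed to the sine fits
```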
@Nick Stokes
You are of course correct. I didn’t express myself very well. What I was getting at was that irregular sampling biases a minimisation method. (Consider an irregularly sampled square wave). The errors will obviously depend on the structure of the signal and, I would guess, become more pronounced at higher frequencies.
A more general point is whether FT methods are the best for spectral estimation. While one thinks of the DFT as generating a spectrum, which it does with some limitations, its importance is in manipulation, via convolution etc., rather than spectral estimation per se. I would imagine that other methods of spectral estimation, such as maximum entropy or eigenvector decomposition of the autocorrelation matrix, might be more stable with this sort of data.
Perhaps I don’t have much to offer here, but a while ago I tried a different tack. I analyzed the HadCRUT4 data and simply tried a curve fitting procedure that I used to characterize magnetic properties (BH curves). The model equation I used in those cases was a high-order polynomial divided by another high-order polynomial. It worked quite well.


In this instance I used a fit equation made up of multiple sinusoids.
I got the following results.
https://onedrive.live.com/redir?resid=A14244340288E543!4393&authkey=!AKZvfIj0HHrcjMw&ithint=folder%2c
With only roughly half of a cycle I will not vouch for the validity of the cycle with the long period. However, I think there is clearly an approximate 60-year cycle. To me it passes the eyeball test. At one time I was showing a 10.7-year cycle, but when I read Willis’s earlier post about the solar cycle I had to agree its magnitude was negligible. I then tried to find a second long-period cycle that might work. It is there, but the magnitude is also negligible.
Nick Stokes says:
That’s my point there. Expand the cos and you get a lin comb of sin and cos transform. The latter are all you need.
Indeed, if there are both phase and amplitude parameters in the cosine at each freq, it is entirely equivalent to the full FT. I beg your pardon.
It’s clearly “I agree with Nick” day — again, his observations are dead on, with a hat tip to Greg as well for correctly giving the equation of a Fourier transform (within normalization). To reiterate, given any function
$latex \tilde{f}(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x) \, e^{-i\omega x} \, dx$

and:
Sine and cosine transforms are just the imaginary and/or real parts of this.
The FT has many interesting properties other than orthogonality (note that the forward and backward transformations are symmetric) and the relation of that orthogonality to the Dirac delta function, which is not really a function but the limit of a distribution. One of them is what happens to the transform of a function with compact support: even a pure sinusoidal harmonic function produces a finite width around the single frequency it represents if you FT it on a finite interval, and on another thread I pointed out that what is called “pressure broadening” produces a Lorentzian line shape in radiation physics because collisions randomly interrupt the phase of the emitted wave, effectively doing the same thing but with many possible data intervals smeared out over a range.
If anybody cares about the kinds of “ringing” artifacts caused by domain truncation, Wikipedia has a whole article on them with numerous links:
http://en.wikipedia.org/wiki/Ringing_artifacts
They are relevant to lots and lots of things — measurable physical phenomena arise because waves in material are like harmonic waves in some cases, and so interesting things happen at sharp boundaries. There are connections between this and the Heisenberg uncertainty principle — a single frequency spatial wave must have non-compact support, and the shorter you make its support relative to its wavelength, the broader its Fourier transform, which is basically saying that the more you localize it, the less certain is its momentum in QM. There are connections to data compression (especially JPEG image compression). There are connections to data processing.

And as Nick said, even with a long train of data, one still has to deal with the numerical issues of doing the integral if one isn’t integrating an analytic function that can be done with a piece of paper and pen. Quadrature is subtle, in part because quadrature itself always is fitting a smooth function to data and evaluating the integral of the smooth function, and there is no way to know if the functional relationship that the data is sampling is actually smooth and interpolable in between. One can always find functions that fool any given quadrature program, and in the end if one has pure data without any other knowledge (so one cannot hand-wave an expectation of smoothness at some level) one is basically crossing one’s fingers and hoping when one applies a quadrature algorithm to evaluate its integral. Many of these problems are apropos the long-running discussion of evaluating a global average of any sort from sparse data.
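A quick toy illustration of that compact-support broadening (arbitrary numbers, not tied to any dataset): even a single pure tone, transformed over a finite record, spreads across neighbouring frequency bins.

```r
# Spectral leakage from a finite record: a sine that does not complete a
# whole number of cycles in the window smears across many frequency bins.
n <- 256
t <- 0:(n - 1)
x <- sin(2 * pi * 10.5 * t / n)        # 10.5 cycles in the record
spec <- Mod(fft(x))[1:(n / 2)]
plot(0:(n / 2 - 1), spec, type = "h",
     xlab = "frequency (cycles per record)", ylab = "|FFT|")
```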
You might also want to look into wavelet transforms:
http://en.wikipedia.org/wiki/Wavelet_transform
and links therein. Wavelets are one of several approaches to dealing with FT artifacts for datasets with compact support. Basically (if I understand correctly, which I may not :-) ) they do something “like” a FT but within a form-factor “window” of some finite width, to effectively remove artifacts like ringing from the result within encapsulated frequency ranges. Note well the connection to the ‘uncertainty principle’ associated with FTs — FTs contain an intrinsic no-free-lunch behavior between the frequency and time domain, and wavelets are an attempt to get a “minimum collective uncertainty” decomposition rather than one that minimizes one kind of uncertainty at the expense of the other. Lots of other good links here too.
rgb
Alex says:
May 4, 2014 at 7:33 am
So, then you are not the alex who can’t spell “explicitly”. Good.
OK, so screw me, once again WordPress ate my (equation) homework. Sigh. Instead of trying to fix it I’ll just post:
http://en.wikipedia.org/wiki/Fourier_transform#Other_conventions
Again, a wealth of information. Tons of really good books on the subject, too.
The Fourier Transform goes back to 1822.
And, IIRC, the FFT goes back to Gauss’ Theory of the motion of the heavenly bodies moving about the sun in conic sections, published in 1809, which is even earlier 😉 I think he was using it to fit trigonometric series. Gauss was a very, very, very smart guy.
Re: missing points or unevenly spaced samples. There are various methods out there, including interpolating between sample points. I don’t know what is best practice these days.
This would be a good time to review a couple of my WUWT favorites with 2,500 year records from tree rings, see
http://wattsupwiththat.com/2011/12/07/in-china-there-are-no-hockey-sticks/ which shows the big cycles are 110, 199, 800, 1324 years long. It does show others, e.g. 18 (Saros cycle?), 24 (close to your 20s), 30, and 37.
http://wattsupwiththat.com/2012/06/17/manns-hockey-stick-refuted-10-years-before-it-was-published/ which stays in the time domain.
The Chinese paper claims the record is long enough for forecasting, and then forecasts cooling through 2068. I’d like to see that work replicated….
As David and Andres point out, above, the real world system does not always have ‘pure’ cycles with a consistent wavelength that the Fourier transform would highlight.
If a cycle has a period of 50 years one time, then 65 years, then 60 years, it will not appear in the frequency spectrum as a strong peak.
For example, while one can see in the temperature data alternating periods of roughly three decades of warming and cooling, periods that appear to be connected to the PDO and to the changing dominance of El Niño over La Niña conditions and vice versa, that alternation does not recur at precisely sixty-year intervals.
So I’m not sure that the FT method is of great value in ascertaining what cyclicity exists or does not exist in the real world.
Willis – re the difference between half-data sets – what would be fascinating to see (but lots more work) would be the equivalent of a spectrogram plot (with your cycle-length x-axis rotated to be vertical, and amplitude represented by color/intensity): process sequential 104-year blocks stepped in time by 1-year increments to obtain an x-axis. This would reveal whether any particular cycle length tends to appear, disappear, and reappear over time. Generally easy to do with a regularly sampled data series & FFT, but may be just too painful using the “SFT”…
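For what it is worth, here is a rough sketch of what that sliding-window version might look like (made-up annual data and made-up function names; a 104-point window of the direct sine fit, stepped one year at a time):

```r
# Step a 104-year window through an annual series one year at a time and
# compute the sine-fit amplitude at each trial period; image() then shows
# the result as a spectrogram-style plot of cycle length against time.
p2p <- function(t, y, P) {
  f <- lm(y ~ sin(2 * pi * t / P) + cos(2 * pi * t / P))
  2 * sqrt(sum(coef(f)[2:3]^2))
}

set.seed(3)
yr <- 1796:2002                                 # hypothetical annual series
y  <- 0.3 * sin(2 * pi * yr / 23) * (yr < 1900) + rnorm(length(yr), sd = 0.3)

periods <- seq(5, 52, by = 1)
starts  <- seq(yr[1], max(yr) - 103, by = 1)
spec <- sapply(starts, function(s) {
  in_win <- yr >= s & yr < s + 104
  sapply(periods, function(P) p2p(yr[in_win], y[in_win], P))
})

image(starts + 52, periods, t(spec),
      xlab = "window centre (year)", ylab = "cycle length (years)")
```

In this toy case the 23-year cycle is switched off after 1900, so its band fades out of the picture partway along, which is exactly the kind of appearing-and-disappearing behaviour discussed in the post.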
The Fourier Transform goes back to 1822.
And the FFT to 1805, Carl Friedrich Gauss, “Theoria interpolationis methodo nova tractata”, which is even earlier 😉
Gunnar Strandell says:
May 4, 2014 at 12:55 am
My favorite is the chaos pendulum that is 100% deterministic but still has a chaotic motion that is unpredictable even if we know the energy content.
===============
very interesting result. like temperature. the motion is bounded, but impossible to calculate where it will be in the future. might be high, might be low, without any change in forcings. at infinity the average is perhaps meaningful, but along the way it is the variance that rules the results.
http://www.myphysicslab.com/dbl_pendulum.html
“According to your Fig.1, Berkeley Earth has global average temperatures starting from 1752… Sorry, but suspension of disbelief goes only so far.”
one can, and we do, calculate our expectation for the entire field given the observations at that time.
It’s pretty simple. However, the error bars are large. That said, one can predict, and we do, what the average temperature would be for all unsampled locations at that time. This estimate uses all the data available, and assumes that the correlation structure is constant over time.
More gas to the fire! Has anyone tried Prony’s method (1795!) for this problem?
You are trying to fit data to a form:
$latex x(t) = \sum_k B_k \, e^{-\sigma_k t} \cos(\omega_k t + \theta_k)$
thus just a sum of decaying sinusoidal waveforms (much like human speech). It at first appears that one only needs 4 data points for each k (four equations in 4 unknowns: B, sigma, omega, and theta). [Sigma = 0 if no decay.] Two problems: (1) these are NOT linear equations and (2) noisy samples CAN spoil everything.
I have never discovered what Prony was doing with discrete series in 1795, but he transformed the non-linear problem into one of first finding the poles (the frequency and decay constants) in terms of a LINEAR DIFFERENCE equation (a discrete-time resonator), to which initial conditions could then be applied, much as we learned to solve differential equations in Calculus 101 (general and particular parts to the problem).
With Prony, you only need tiny amounts of data. Four consecutive points DO give you one sinusoidal component. And it is then trivial to compute (and compare) the rest of the series. The result of course does NOT necessarily mean anything. It is quite possible that your initial assumptions (that the signal being analyzed is composed of sinusoidal waveforms, and/or that you guessed the right order) are wrong. Does your series represent the data outside the training? Often (usually?) not.
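For the curious, here is a rough noise-free sketch of that two-step recipe (a reconstruction of the textbook method, with made-up data and the model order chosen by hand):

```r
# Classical Prony analysis, sketched: (1) solve the linear-prediction
# (difference-equation) system for the characteristic polynomial,
# (2) its roots give the decay constants and frequencies,
# (3) amplitudes and phases then come from an ordinary linear fit.
dt <- 1
t  <- (0:199) * dt
x  <- 1.5 * exp(-0.01 * t) * cos(2 * pi * t / 30 + 0.4) +
      0.8 * cos(2 * pi * t / 11 + 1.1)           # two modes, one decaying
p  <- 4                                          # model order: two per mode
N  <- length(x)

# step 1: x[n] = -a1*x[n-1] - ... - ap*x[n-p], least squares for a
A <- sapply(1:p, function(k) x[(p - k + 1):(N - k)])
a <- qr.solve(A, -x[(p + 1):N])

# step 2: the poles are the roots of z^p + a1*z^(p-1) + ... + ap
z     <- polyroot(c(rev(a), 1))                  # coefficients in ascending order
freq  <- Arg(z) / (2 * pi * dt)                  # cycles per unit time
decay <- -log(Mod(z)) / dt
sort(unique(round(1 / abs(freq), 1)))            # recovered periods: ~11 and ~30

# step 3: with the poles fixed, amplitude and phase are a linear problem
pos <- which(Im(z) > 1e-8)                       # one pole from each conjugate pair
X   <- do.call(cbind, lapply(pos, function(i)
         cbind(exp(-decay[i] * t) * cos(2 * pi * freq[i] * t),
               exp(-decay[i] * t) * sin(2 * pi * freq[i] * t))))
coef(lm(x ~ X - 1))                              # B*cos(theta) and -B*sin(theta) pairs
```

With noise, as noted below, the same idea is usually applied through an overdetermined least-squares or autocorrelation formulation rather than the bare equations above.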
Noise is the second problem, but it can be attacked. I recently wrote up some comments on this:
http://electronotes.netfirms.com/EN221.pdf
and the first references there may be a better introduction. Things like solving a very large overdetermined set of equations (least squares pseudoinverse) and autocorrelation (Yule-Walker) are very useful.
Fun to try. Any connection to reality is, of course, tenuous!
Good review of Fourier stuff on this thread, and wavelets always should be considered.
Given the data are taken from nature and that feedbacks (positive and negative) exist in nature, an FT analysis may not tell one what they may think. The global biome, for example, responds immediately to Krakatoa-scale volcanic activity. Also – I’m not sure why Willis thinks he’s not working in both the time and frequency domains, but that is what [SF]FT does. When I read the first few paragraphs I knew I was looking at an FT analysis, and it also reminded me of early work with Lissajous and spectrum analysis work I did with swept filters and noise/wave generators back when things were analog.
Anyway – another very interesting article that sheds light on the complexities of climate analysis.
RS says:
May 4, 2014 at 9:03 am
Thanks, RS. That’s helpful. I think the difference might best be understood by considering what I’m doing at a given cycle length. Let’s take 60 years as an example.

I have a chunk of data. It may or may not be either periodic or complete. I use an iterative fitting procedure to determine the phase and amplitude of the best fit of a sine wave with a period of sixty years.
Now, forget about our friend Joe Fourier for a moment. All I’m doing is determining the amplitude of the best fit.
So let’s suppose for a moment that our chunk of data is a 60-year sine wave with a bunch of random normal error added to it.
Now consider how much effect the removal of a portion of that 60-year signal plus error will have on the amplitude of the fitted sine wave … not much. The chopping out of chunks of the signal doesn’t affect the fit a whole lot.
Now, if the missing data is regular in nature, like you chopped out every 13th year of the signal, that will definitely affect the results. But in general, my algorithm is remarkably resistant to missing data.
That’s what I meant when I said that I’m not transforming a time-domain signal into the frequency domain and then back out again. I’m just looking at how well a sine wave matches the data.
Here’s an example showing what I mean. I’ve taken a 180-year signal which is a sine wave with a period of 30 years, plus Gaussian noise with a standard deviation of 0.5. Then I altered that data in two ways. In one, I randomly removed 20% of the data. In the other, I removed all of the data for 1900 to 1902, as well as the data for 1917. Then I ran the slow Fourier algorithm on all three. Here are the results:
As you can see, even with 20% of the data missing, or with a couple of year-long sections missing entirely, there is little effect on the results.
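For anyone who wants to try this at home, here is a rough reconstruction of that test from the description above (a sketch, not the exact code in the linked folder; the random seed is arbitrary):

```r
# Missing-data test: a 30-year sine plus noise, analysed (a) complete,
# (b) with 20% of the points dropped at random, and (c) with the years
# 1900-1902 and 1917 removed entirely.
p2p <- function(t, y, P) {
  f <- lm(y ~ sin(2 * pi * t / P) + cos(2 * pi * t / P))
  2 * sqrt(sum(coef(f)[2:3]^2))
}
sft <- function(t, y, periods) sapply(periods, function(P) p2p(t, y, P))

set.seed(42)
t <- 1831 + (0:(180 * 12 - 1)) / 12                # 180 years of monthly data
y <- sin(2 * pi * t / 30) + rnorm(length(t), sd = 0.5)

keep80 <- sort(sample(length(t), 0.8 * length(t))) # drop 20% at random
nogap  <- !(floor(t) %in% c(1900:1902, 1917))      # drop four whole years

periods <- seq(5, 90, by = 1)
matplot(periods, cbind(sft(t, y, periods),
                       sft(t[keep80], y[keep80], periods),
                       sft(t[nogap], y[nogap], periods)),
        type = "l", lty = 1, xlab = "cycle length (years)",
        ylab = "peak-to-peak amplitude")
```

All three curves should show essentially the same 30-year peak.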
Thanks. Yes, I understand that. However, I’m not trying to reconstruct a signal from a set of orthogonal (or non-orthogonal) components.
As my example above shows, I can knock out 20% of a signal and still get essentially the same periodogram. I’d be interested in what your underlying theory would say about that.
I’d love to, but I don’t see how to do what you propose. Perhaps you could explain in a bit more detail.
w.
Willis, you may like this.
Where sea ice is will affect ocean currents and ocean currents will affect where sea ice grows and shrinks.
The way I envision the system is that harmonic circulation patterns will form in some ice conditions and be destroyed in others.
Jeffrey says:
May 4, 2014 at 9:23 am
Thanks, Jeffrey. As the code shows, but also as I forgot to mention, all data series are detrended before analysis.
w.
“Jeffrey says:
May 4, 2014 at 9:23 am
In performing a Fourier analysis to look for periodicities, like you are doing, it is best to first detrend the data.”
YUP!
This is one of the tricks that folks can use to manufacture 60 year cycles.
How?
Well, first off, to “detrend” the data one is asserting that there is a “trend” in the data. Well, the data have no trend. The data are just the data. A trend comes about by a DECISION to apply a model to the data. For example, one asserts that there is a linear trend and then that trend is removed.
But we know that many trends can be found in the data. We could fit a quadratic, we could fit a spline. We can find many models that “fit” the data and “remove” the trend terms from the data. But the data as the data have no trend. Trend is a property of a model used to explain the data.
If you fiddle about there is no doubt that one can remove a “trend” and find a cycle in what’s left. The trick is removing a well-chosen “trend”. That is, if you cherry-pick a different trend to “remove”, you can engineer out a 60-year cycle.
This is how Scafetta gets his 60-year cycle.
Say what? Yes, the trick is in the detrend step. Remember, you can detrend with a linear trend or you can pick some other trend to remove. The data have no trend; the model you fit to the data has the trend term.
http://www.skepticalscience.com/Astronomical_cycles.html
##### key paragraph
“The data has been detrended assuming an underlying parabolic trend. The main 60 year cycle, due to the alignment of Jupiter and Saturn, shows up very clear, but there are more. In particular, he identifies a total of 10 cycles due to combination of planets motion and one due to the moon (fig. 6B in the paper). Of those cycles, only two more are considered significant, namely those with periods of 20 and 30 years.
Fascinating. But then, a few pages later, Scafetta writes:
“However, the meaning of the quadratic fit forecast should not be mistaken: indeed, alternative fitting functions can be adopted, they would equally well fit the data from 1850 to 2009 but may diverge during the 21st century.”
His warning is on the problem of extrapolation of the trend in the future, which he nonetheless shows. But this sentence made me think that it’s true, once we put physics aside, we’re free to use the trend we like; so why parabolic? I decided to take a closer look, and this turns out to be the beginning of the end.
#############
Bottom line.
You can produce the magical 60-year cycle by cherry-picking the trend you choose to remove.
Next, to do cycle analysis properly the “trend” has to be removed.
But removal of the trend IS UNDERDETERMINED. That means nothing in the data tells you what the underlying data-generating process really is. Many models can fit the data.
Next, most models (I’m talking statistical models) fit to climate data are NON-PHYSICAL. That is, their functional form is wrong from a physics standpoint (temperature cannot increase linearly over all time).
Cycle analysis after trend removal is pretty much doomed to failure unless one can show that the trend removed is in fact a unique feature of the data-generating process; at best, the cycle analysis is only true if the assumptions about the trend removal are valid. That is, all the statistical tests are subject to a methodological uncertainty that is typically not calculated.
1. Assume the trend is linear (back up this assumption with a statistical test).
2. Adjust the data by removing it.
3. Find a cycle.
4. Assert the existence of the cycle with 95% confidence.
5. Don’t tell people that step 1 has its own uncertainty. Don’t tell people that you had a choice of trends (data-generating models) to remove and that the methodological uncertainty is substantial.
Having said that, it’s not “wrong” to remove a linear trend. It’s an assumption. It’s just a tool, a choice. There are other choices, and so one’s understanding is conditioned by that choice. And that choice is open. It is an assumption about the underlying process. It is applying a theory to the data to produce adjusted data. The other path to understanding is to create models of the underlying process. That would be a GCM.
RS says:
May 4, 2014 at 9:26 am
Thanks, RS. I love the web, your comments point me to further investigations. OK, let’s consider an irregularly sampled square wave. I’ve taken a 180-year series composed of six 30-year square waves. Then I randomly knocked out a full 40% of the data. Forty percent gone. Here are the results:

I’m not seeing the degradation of the results that you seem to be implying …
w.
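A rough reconstruction of that square-wave test, for anyone who wants to repeat it (again a sketch based on the description above, not the code in the linked folder):

```r
# Six 30-year square waves over 180 years, with 40% of the samples removed.
p2p <- function(t, y, P) {
  f <- lm(y ~ sin(2 * pi * t / P) + cos(2 * pi * t / P))
  2 * sqrt(sum(coef(f)[2:3]^2))
}
set.seed(7)
t <- 1831 + (0:(180 * 12 - 1)) / 12
y <- sign(sin(2 * pi * t / 30))                    # 30-year square wave
keep <- sort(sample(length(t), 0.6 * length(t)))   # knock out 40% of the data

periods <- seq(5, 90, by = 1)
full  <- sapply(periods, function(P) p2p(t, y, P))
holey <- sapply(periods, function(P) p2p(t[keep], y[keep], P))
plot(periods, full, type = "l", xlab = "cycle length (years)",
     ylab = "peak-to-peak amplitude")
lines(periods, holey, col = "red")
# the 30-year peak (and the square wave's odd harmonics near 10 and 6 years)
# should look much the same with and without the gaps
```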