Guest Post by Willis Eschenbach
I’ve developed a curious kind of algorithm that I’ve humorously called “slow Fourier analysis”, but which might be better described as a method for direct spectral analysis. The basic concept is quite simple. I fit a sine wave with a certain cycle length, say 19 months, to the dataset of interest, and I note the peak-to-peak amplitude of this best-fit sine wave. I repeat this process for the various cycle lengths of interest, and put that all together as a periodogram. The periodogram shows, for each cycle length, how large a sine wave best fits the data. This method has worked a treat, but yesterday, this odd method of mine handed me a big surprise.
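In R, a bare-bones sketch of the idea (an illustration only; the full sineptop function linked at the end of the post handles the details) is to fit a cosine/sine pair at each trial period by least squares and take the peak-to-peak amplitude of the result:

slow_sine_fit <- function(x, t, periods) {
  sapply(periods, function(P) {
    cw <- cos(2 * pi * t / P)                  # cosine wave with trial period P
    sw <- sin(2 * pi * t / P)                  # sine wave with trial period P
    cf <- coef(lm(x ~ cw + sw))                # least-squares fit: a * coswave + b * sinwave + c
    2 * sqrt(cf[["cw"]]^2 + cf[["sw"]]^2)      # peak-to-peak amplitude of the best-fit sine
  })
}
# e.g. for monthly data, integral periods from two months up to a third of the record:
# pgram <- slow_sine_fit(x, t = seq_along(x), periods = 2:floor(length(x) / 3))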
Now, I have been doing these periodograms utilizing only the time periods inherent in the data. So if the data was monthly, I’ve looked at all possible integral lengths of monthly cycles, starting with two month, three month, four month, five month cycles, and so on up to cycles with periods as long as a third the number of months in the data.
And on the other hand, if the data was yearly, I’ve looked at all possible integral lengths of yearly cycles, starting with two year cycles, three year cycles, four years, and so on up to a third of the number of years in the data.
(I mistrust reports of the longer cycles, and any cycle with a period longer than a third of the dataset is not really worth calculating—it will fool you every time.)
So I was going along in blissful ignorance when an alert WUWT reader, Greg Goodman, pointed out that there was no reason to restrict myself to integral periods. He noted that I could subdivide the results and gain greater resolution. I was dismissive of his idea. I allowed that you could do it, and that it was an interesting suggestion, but I said I didn’t think you would gain any real resolution by doing it. Hang on, let me get my exact words and the graphic I used in that thread to illustrate my point … here’s what I said:
In any case, I’m still not convinced that the procedure will give any real increase in information. Here’s the difference, for example, between monthly and annual data:
I think that if I increase the resolution on the sunspot periodogram while still using the annual data, it won’t look anything like the real monthly data … I’ll report back on that one.
Well … this is the promised “report back”. As you can see, both the monthly and the annual data show the peak at 11 years. The monthly data shows more detail, as it resolves the individual peaks at 10, 11, and ~11.9 years. I thought that the test I outlined would be a good one: I’d see what happened when I sampled annual data at monthly intervals. So here’s what I did. I took the SIDC monthly sunspot data. For the annual data, I didn’t use the SIDC annual data. Instead, I averaged the SIDC monthly data to give me annual data that I knew was a true annual average of that particular monthly dataset. Figure 2 shows the monthly data in red, and the annually averaged monthly data in black.
Figure 2. Monthly SIDC sunspot data (red), and the same monthly SIDC data annually averaged (black).
So to be clear, the data shown as the black line is the annual average of the monthly data shown as the red line.
Next, I modified my periodogram-generating computer function to allow for fractional time sampling of the data.
Finally, I calculated three periodograms. One is the periodogram of the annually averaged data, shown in yellow/black below. The second is the periodogram of the underlying monthly data, shown in blue.
And finally, I calculated the third periodogram (shown in cyan) using the annually averaged data, but sampled on a monthly basis … Figures 3 and 4 show those results. I must confess, when I saw them my jaw hit the floor.
Figures 3 and 4. Upper panel shows the periodogram of annual sunspot data (yellow/black), monthly sunspot data (blue), and monthly-sampled annual data (cyan). Lower panel is a closeup of the upper panel, showing the period from seven to twenty-five years.
You can see why my jaw hit the floor. The blue line (periodogram of actual monthly data) is almost identical to the cyan line (periodogram of annual data sampled at monthly intervals). Who knew?
It turns out that contrary to my expectation, the information about the strength of the monthly cycles is NOT lost when the individual monthly values are subsumed into annual averages. Somehow, that information still exists and can be retrieved. When I take the annual data and analyze it at monthly frequencies, the data is still there. Not exactly, to be sure, as you can see the process is not perfect or precise.
But it is astoundingly accurate. I absolutely didn’t expect that.
Now, I suppose some signal analysis guy will pop up and say he knew that all along and abuse me for my colossal ignorance … but I taught myself this stuff, and all I can say is, it sure was a surprise to me.
Truly, I don’t understand this result. For many, many years I’ve thought that when you average data in chunks, like say averaging monthly data into yearly data, that all of the monthly information was gone. Lost. Kaput. Irretrievable.
But that doesn’t seem to be the case at all. It seems that very little of the power spectrum information was lost at all as a result of the annual averaging process.
Naturally, of course, this brings up the next question—is this procedure invertible? That is to say, given a periodogram such as the one above, can I run the process in reverse? Can I start from the annual data, calculate the monthly periodogram from that annual data, and then invert the periodogram to give me back the monthly data? That would be really amazing … but I’m pessimistic.
My guess is no, the periodogram can’t be inverted to give me back the monthly data … but given how poor my last pathetic guess (at the head of this post) was, I’ll certainly give it a try. Any assistance gladly accepted.
Like I said above … once again all I can say is, I’ll report back on that one.
Regards to all,
w.
Acknowledgement: My thanks to WUWT reader Greg Goodman for the suggestion to investigate fractional time periods.
For Clarity: If you disagree with something someone says, please quote the exact words you disagree with. It avoids all kinds of misunderstandings.
Data and Code: I’ve collected the R code, the R functions, and the data into a small (22KB) zipped folder called “Oversampling Folder”. It includes the monthly and annual sunspot data as CSV files. If you change your R workspace to the folder, it should be turnkey.
For the original data:
SIDC Monthly Sunspot Data
Note that, following the advice of Leif Svalgaard, I have increased all of the pre-1947 SIDC sunspot values by 20% to correct for the change in counting methods at that time. The change is immaterial for this analysis.
Changes In The R Function: The function to generate the periodogram, named “sineptop” (for sine peak-to-peak), was previously called like this:
sineptop(annual_data, frequency = 1)
or like this:
sineptop(monthly_data, frequency = 12)
The “frequency” argument identifies the data as having that many observations per year (12 for monthly data, 1 for annual data).
It can still be called like that. In addition, however, the new sineptop syntax for e.g. monthly sampling an annual dataset looks like this:
sineptop(annual_data, frequency = 1, by = 1/12)
If the “by” argument is not specified it defaults to 1, so the original syntax still works unchanged.
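In other words, the three periodograms in Figure 3 correspond to calls along these lines (the actual code is in the zipped folder above):

sineptop(annual_data, frequency = 1)             # annual data at integral years (yellow/black)
sineptop(monthly_data, frequency = 12)           # monthly data (blue)
sineptop(annual_data, frequency = 1, by = 1/12)  # annual data oversampled at monthly steps (cyan)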



{ for a time series x with n elements and filter series f with m elements, circular convolution }
{ I typed this from memory… }
{ m should be odd }
{ if f[..] coefficients are set to 1/m we have a box-car filter }
m2 := m div 2;
for i := 0 to n-1 do
begin
  xf[i] := 0;
  for j := -m2 to +m2 do
    xf[i] := xf[i] + x[(i+j+n) mod n] * f[j+m2];  { accumulate; wrap the ends circularly; f indexed 0..m-1 }
end;
C I miss FORTRAN.
Another rather excellent thread Willis!
It certainly stirred more dim memories for me than many other articles have.
Grant Foster (Tamino): Thank you for dropping in and leaving a solid explanation!
At the risk of getting my head blown off:
The method of interpolation from a DFT is to expand the transform by adding zeros to the high frequency end of the spectrum, effectively increasing the sampling rate (care has to be taken with end effects here). This is equivalent to using a filter that has a rectangular, wholly real frequency response.
It is easily shown that this is equivalent to convolving the signal with a sinc function:
sin(x)/x
whose argument is scaled so that it advances by pi over one original inter-sample period, so its zeros fall on the other original sample instants. This means that at the actual samples you are adding zero, while the gaps in between get reconstructed. The sample itself is not altered, because sin(x)/x tends to 1 as x tends to zero (the value at zero is defined by that limit). One can show that this is valid using distribution theory. If you have an irregularly sampled signal, this method breaks down.
However, as regards inverting the “transform”: if you have calculated the coefficients correctly, you should be able to reconstruct the original signal EXACTLY, simply by summing the Fourier series or by inserting your calculated values into a DFT and inverting it. You should take care with this, because the Fourier transform is represented with the DC component at zero and the sine/cosine components represented at 0.5 of their amplitudes at +/- the frequency in question, with the component at negative frequency being the complex conjugate of that at positive frequency. (Because cos(wt) = 0.5*(exp(jwt)+exp(-jwt)) and sin(wt) = -0.5*j*(exp(jwt)-exp(-jwt)).)
I feel that there may be problems in doing this, because what you are doing is not a DFT: the frequencies are not exact multiples of the fundamental frequency (i.e. 1/signal length).
Michael D says: May 27, 2014 at 10:19 am
“It would more accurate to call it a subsampled boxcar convolution I suppose.”
Yes, that is how I think of it. The SSN is convolved with a gate function, then multiplied by a shah function (Dirac comb of delta functions), then Fourier transformed.
You can see this in the freq domain, and see where error comes in. The convolution becomes multiplication by a sinc function (FT of the gate). This represents sampling inaccuracy on the year/month scale. Multiplication by the shah becomes (approximately) convolution by the FT of the shah, which is itself a shah. But it isn’t exactly a shah; it has been truncated, i.e. multiplied by another gate, this time the data length. So you get another sinc function – this time very broad in the freq domain, representing limited-data error.
I see I’m just echoing
Nullius in Verba says: May 27, 2014 at 10:59 am
here
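As a rough numerical footnote to the sinc factor described just above (my own back-of-envelope check in R, using the standard boxcar/Dirichlet gain formula, not anything from the comment itself): a 12-month average barely attenuates decadal periods, which is consistent with the annual averages still carrying the ~11-year information.

P <- seq(24, 300)                                   # trial periods in months, 2 to 25 years
gain <- abs(sin(pi * 12 / P) / (12 * sin(pi / P)))  # amplitude gain of a 12-month boxcar average
round(range(gain[P >= 108 & P <= 144]), 3)          # about 0.98 to 0.99 in the 9- to 12-year band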
I know in signal processing you can retrieve information that has been “lost” by averaging, or that has a very low SNR, even to the point of being indistinguishable from noise, so long as you know the length of the period(s) in which the data was encoded and have enough data … so thinking about it, I’m not too surprised. But this post is an excellent example of DIY science and enthusiasm!
RC Saumarez says:
May 27, 2014 at 2:06 pm
Yes, this is basically the same as you get with zero padding in the usual FFT for evenly spaced data. If you truncate the functional basis to reconstruct the monthly values from inverse transforming, you are essentially performing sinc interpolation. It doesn’t really provide any new information; it is just more pleasing to our eyes and thought patterns.
Nick Stokes says:
May 27, 2014 at 2:23 pm
I see I’m just echoing
Nullius in Verba says: May 27, 2014 at 10:59 am
here
…..
who was echoing …..
“Nullius in Verba” roughly meaning: didn’t read a word.
@Bart
If you have a properly sampled signal, you have the information, and obviously Fourier interpolation doesn’t add more information in a strict sense. Padding a DFT is exactly equivalent to convolution with a sinc function.
If you want to know, say, the amplitude of a peak that lies between samples, and its position, this can be recovered (subject to some constraints) and is therefore useful.
Merrick says:
May 27, 2014 at 3:36 am
I’ve said nothing of the sort. The advantages that my method has are:
1. It doesn’t care about missing data. In climate science this is a huge plus.
2. It is linear in period, making the interpretation simple.
3. It is easy to calculate the statistical significance of the results, including adjusting for autocorrelation.
4. It gives answers in the units of the underlying data (°C, mm, etc.).
5. It is computationally straightforward, and despite the name, certainly fast enough for the sizes of datasets commonly found in climate science.
6. It works, not just with missing data, but with irregularly spaced datasets. So if for example you have monthly data mixed in with biweekly data, plus scattered hourly data, it doesn’t matter. It just uses the times of observations that it is given.
Other than that, it is, as I said in the head post, just another among the many methods of spectrum analysis, in this case a very direct sort of steam-powered, brute-force method that is easy to implement and understand.
w.
RC says: The method of interpolation from a DFT is to expand the transform by adding zeros to the high frequency end of the spectrum, effectively increasing the sampling rate (care has to be taken with end effects here). This is equivalent to using a filter that has a rectangular, wholly real frequency response.
It is easily shown that this is equivalent to convolving the signal with a sinc function:
sin(x)/x
====
The rectangular window is applied as soon as you take a finite sample of continuous data and try to analyse it. It has nothing to do with the zeros.
If you just add zeros you could be increasing the discontinuities at the end. It’s often best to subtract the mean so that the ‘zeros’ are in fact padding with the mean.
Then there’s the question of a “taper function” to avoid the discontinuities (that is done before adding zeros). This scales the signal down to zero (which, after mean removal, is the mean level) at the start and end. This itself distorts the data but is usually preferable to the gross effects of just wrapping the end onto the start.
The whole idea of adding zeros is to get around the fixed intervals of the DFT. It does not even apply to Willis’s code, where he is free to choose any period or interval he wishes.
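For the standard fixed-interval FFT case, a minimal sketch of that mean-removal / taper / pad recipe in R (my own illustration, using R's built-in monthly sunspots series and a Hann taper as one common choice):

x <- as.numeric(sunspots)                                  # any evenly spaced series
n <- length(x)
x0 <- x - mean(x)                                          # so the padding "zeros" sit at the mean level
taper <- 0.5 * (1 - cos(2 * pi * (0:(n - 1)) / (n - 1)))   # Hann taper to remove the end discontinuities
spec <- Mod(fft(c(x0 * taper, rep(0, 3 * n))))             # zero-pad to 4x length: finer grid, same information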
Willis: “1. It doesn’t care about missing data. In climate science this is a huge plus.”
Does it care about missing heat? That would definitely be useful.
Michael D says:
May 27, 2014 at 10:32 am
Thanks, Michael. I started by using a double optimization method to adjust the phase and amplitude for the best-fit sine wave, but that was slow. In the end, I simply made up a cosine wave and a sine wave of the desired frequency that were the length of the dataset, both of them starting at t = 0.
Then I did a least-squares linear analysis of the form
a * coswave + b * sinwave + c = modeled value
and used the resulting fitted curve without getting into considerations of phase.
I was a happy man when I thought of that way around the optimization. I got the inspiration from Fourier’s original method, which (from memory from a while ago) used sine/cosine pairs rather than complex numbers, combined with a comment from a reader. Gotta love doing science online.
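(The identity behind that shortcut: a*cos(wt) + b*sin(wt) = R*cos(wt - phi), with R = sqrt(a^2 + b^2) and phi = atan2(b, a), so the peak-to-peak amplitude is just 2*sqrt(a^2 + b^2), and the phase is there in a and b if it is ever wanted.)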
w.
Don’t you think it odd that we still use months as a timebase when they are of different lengths, so a 28-day month has the same information as a 31-day month?
justaknitter says:
May 27, 2014 at 10:48 am
You seem to be falling into the “I’ll provide advice instead of doing it myself” trap. This was from my last post, but your comment seems more related to that post, so I’ll repeat my words here; they’re apropos:
I don’t care about your theories about what I’m doing wrong, and about just how and where I should look for the 11-year cycle, justaknitter. That’s your job. Get back to me when you’ve actually knitted something solid following your own good advice, and have it in hand to show us …
w.
Nullius in Verba says:
May 27, 2014 at 10:59 am
Thanks, Nullius, your comment helped greatly.
w.
Nick Stokes says:
May 27, 2014 at 2:21 pm (Edit)
Interesting, Nick, and my thanks to you and all the others. Always more to learn.
w.
You seem to be falling into the “average kid trap”.
==============
Agreed. The assumption underlying the analysis is that average temperature varies as average sunspot numbers. If so, then the cycle lengths should align.
But what if temperature and sunspot numbers both vary as cycle length? Shorter cycles, more spots, temps increase over time. In this case would you expect the cycle lengths to align?
What if temp = ratio of short cycles to long cycles? For example, take a period of roughly 220 years:
10 short and 10 long cycles = 10/10 = 1.0 temp (unchanged)
8 short and 12 long cycles = 8/12 = 0.67 temp (temp decrease)
12 short and 8 long cycles = 12/8 = 1.5 temp (temp increase)
Would you expect to see the amplitude of the temperature cycle at the same frequency as the average solar cycle length? I don’t see why you would, except by accident.
RC Saumarez said in part May 27, 2014 at 2:06 pm:
“…The method of interpolation from a DFT is to expand the transform by adding zeros to the high frequency end of the spectrum, effectively increasing the sampling rate. (care has to be taken with end effects here). …”
I pretty much agree. Thanks RC. But to be clear the “high end of the spectrum” is the MIDDLE of the DFT. In fact this is how I interpolated the three points in my graph posted above at 10:27 AM, repeated here:
http://electronotes.netfirms.com/Pdemo.jpg
I did a three point FFT X(k) of the three original time points x(n). Then I put 97 zeros in the middle:
XX(k) = [X(0), X(1), zeros(1,97), X(2)]
and took 100 times the real part of the inverse FFT of the now length-100 XX. (There is always a tiny imaginary roundoff in the inverse that needs to be discarded.)
This interpolated the three points to 100 time points to better show the higher frequency (1/3, rather than 1/360) that the FFT “had in mind”. So I had not been thinking of it this way (too busy calling the FFT a liar!), but perhaps the FFT frequency-domain interpolation “recovers” the frequencies lost to low-pass averaging. Neat.
[Incidentally, what if the FFT length is even? For example, if N = 8. In this case, if things are properly bandlimited, the exact middle is X(4), half the sampling rate, and it should be zero. If not, you split the X(4) term, which must be real, in two between the padded zeros, half staying down, and half going up. Really – we can show this is true.]
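For anyone who wants to try that recipe in R (my own translation of the description above; the three data values are made up, and R's fft carries no 1/N factor, so the scaling is done by hand):

x <- c(2, 5, 3)                           # three time points (made-up values)
X <- fft(x)                               # X[1], X[2], X[3] correspond to X(0), X(1), X(2) above
XX <- c(X[1], X[2], rep(0, 97), X[3])     # 97 zeros in the MIDDLE of the spectrum, length now 100
xx <- Re(fft(XX, inverse = TRUE)) / 3     # inverse DFT, divided by the ORIGINAL length to keep the amplitude
length(xx)                                # 100 interpolated time points through the original three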
Re zero padding
https://ccrma.stanford.edu/~jos/st/Zero_Padding_Applications.html
” Sometimes people will say that zero-padding in the time domain yields higher spectral resolution in the frequency domain. However, signal processing practitioners should not say that, because “resolution” in signal processing refers to the ability to “resolve” closely spaced features in a spectrum analysis (see Book IV [70] for details). The usual way to increase spectral resolution is to take a longer DFT without zero padding–i.e., look at more data. In the field of graphics, the term resolution refers to pixel density, so the common terminology confusion is reasonable. However, remember that in signal processing, zero-padding in one domain corresponds to a higher interpolation-density in the other domain–not a higher resolution. ”
consider the simple signal 1,1,1,1
now zero pad it to give twice the frequency resolution
1,1,1,1,0,0,0,0
If you are trying to tell me that the FFT of a DC signal is identical to that of a step function then something is very wrong. And a quick check says it ain’t. Even if you use two leading zeroes and two after, it still ain’t. (OK, I agree that if you detrend the original signal and then zero pad you get the ‘right’ answer, but B=1/T is a law, not an opinion; there is no more information in the sample than that.)
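Here is that quick check, done in R (one way to do it):

Mod(fft(c(1, 1, 1, 1)))              # 4 0 0 0 : all the energy in the DC bin
Mod(fft(c(1, 1, 1, 1, 0, 0, 0, 0)))  # 4 2.61 0 1.08 0 1.08 0 2.61 : energy smeared into the other bins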
So not very surprisingly we discover that there is no free lunch. Yes, you will increase your apparent frequency resolution but in doing so you will be altering the displayed spectrum in ways that you may not be able to predict.
As to the fake extra resolution you get by curve fitting a sine/cosine pair of arbitrary wavelength to a signal:
I agree that at first sight, for instance, an 11.9 Hz sine wave would fit an 11.9 Hz sine-wave signal perfectly, even for very short sample lengths. But B=1/T says that in a .1 second sample you can’t actually tell the difference between that and a signal that consists of a strong 12 Hz and a weaker contribution from other frequencies. You are forcing the fit of the 11.9 and assuming that there are no other cycles of roughly the same frequency. That may be true for a particular system but is not true for a general signal.
WM Briggs advised: Do not smooth time series, you hockey puck! September 2008
There is an intrinsic relation between sinusoids and time invariance. A system is said to be time invariant if its response to a delayed excitation is the same as the response to the excitation delayed.
The time delay operator is a linear one (sum of two delayed functions is the sum delayed).
Now, the eigenfunctions of the time delay operator are helices (screws), expressed as cos(wt) + j*sin(wt): complex-valued functions of the real variable t whose geometry is that of a screw. Delaying such a helix simply multiplies it by a fixed complex number of unit absolute value.
Multiplication by a complex number of unit absolute value is simply a rotation (in the complex plane). Projection of a screw onto a plane parallel to its axis is a sinusoid.
The secret is that you just rotate the screwdriver, while the screw itself goes straight ahead into the plank; a really mysterious phenomenon, if you give it a thought.
Filters are linear operators that commute with the time delay operator. If two linear operators commute, there is a base built of their common eigenvectors. This is why Fourier transform is useful in analysing filters.
For systems with no time delay symmetry Fourier transform does not make much sense.
sorry make that a 1 second sample, not .1
Poor Fourier must be rolling over in his grave. Few here comprehend that his series of harmonically spaced, finite-amplitude sinusoids provides a valid decomposition only for L-periodic signals, with 1/L as the fundamental frequency. Any other set of LS-fitted sinusoids fails to add up to the signal, especially when the underlying signal isn’t periodic. That’s what makes sinusoidal regression so devastating in the hands of novice signal analysts, captivated by its simplicity and other putative “advantages”.
@Bernie Hutchins
Those notes were very interesting. Thanks.
The effect of bringing the poles just inside the unit circle is a way of raising the notches just off the floor in the FD. When doing the deconvolution in the FD, what I usually do is just add a small offset to ensure the division does not explode the noise at those points.
Are you aware of a less arbitrary way of doing that? Or since it’s pretty arbitrary, maybe that’s as good as anything.
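A bare sketch of that small-offset frequency-domain deconvolution in R (my own illustration, not the commenter's code; here the offset goes into a |H|^2 + eps denominator, which keeps the phase intact and never divides by zero):

deconv_fd <- function(y, h, eps = 1e-3) {
  Y <- fft(y)                                     # spectrum of the filtered series
  H <- fft(c(h, rep(0, length(y) - length(h))))   # filter kernel zero-padded to the data length
  Re(fft(Y * Conj(H) / (Mod(H)^2 + eps), inverse = TRUE)) / length(y)
}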