Guest Post by Willis Eschenbach
While investigating the question of cycles in climate datasets (Part 1, Part 2), I invented a method I called “sinusoidal periodicity”. What I did was to fit a sine wave of various periods to the data, and record the amplitude of the best fit. I figured it had been invented before, so I asked people what I was doing and what its name was. I also asked if there was a faster way to do it, as my method does a lot of optimization (fitting) and thus is slow. An alert reader, Michael Gordon, pointed out that I was doing a type of Fast Fourier Transform (FFT) … and provided a link to Nick Stokes’ R code to verify that indeed, my results are identical to the periodogram of his Fast Fourier Transform. So, it turns out that what I’ve invented can best be described as the “Slow Fourier Transform”, since it does exactly what the FFT does, only much slower … which sounds like bad news.
My great thanks to Michael, however, because actually I’m stoked to find out that I’m doing a Fourier transform. First, I greatly enjoy coming up with new ideas on my own and then finding out people have thought of them before me. Some folks might see that as a loss, finding out that someone thought of my invention or innovation before I did. But to me, that just means that my self-education is on the right track, and I’m coming up with valuable stuff. And in this case it also means that my results are a recognized quantity, a periodogram of the data. This is good news because people already understand what it is I’m showing.
Figure 1. Slow Fourier transform periodograms of four long-term surface air temperature datasets. Values are the peak-to-peak amplitude of the best-fit sine wave at each cycle length. The longest period shown in each panel is half the full length of the dataset. Top panel is Armagh Observatory in Ireland. The second panel is the Central England Temperature (CET), which is an average of three stations in central England. Third panel is the Berkeley Earth global temperature dataset. The fourth panel shows the HadCRUT4 global temperature dataset. Note that the units are in degrees C, and represent the peak-to-peak swings in temperature at each given cycle length. Data in color are significant after adjustment for autocorrelation at the 90% level. Significance is calculated after removing the monthly seasonal average variations.
I’m also overjoyed that my method gives identical results to its much speedier cousin, the Fast Fourier transform (FFT), because the Slow Fourier Transform (SFT) has a number of very significant advantages over the FFT. These advantages are particularly important in climate science.
The first big advantage is that the SFT is insensitive to gaps in the data. For example, the Brest tide data goes back to 1807, but there are some missing sections, e.g. from 1836-1846 and 1857-1860. As far as I know, the FFT cannot analyze the full length of the Brest data in one block, but that makes no difference to the SFT. It can utilize all of the data. As you can imagine, in climate science this is a very common issue, so this will allow people to greatly extend the usage of the Fourier transform.
The second big advantage is that the SFT can be used on an irregularly spaced time series. The FFT requires data that is uniformly spaced in time. But there’s a lot of valuable irregularly spaced climate data out there. The slow Fourier transform allows us to calculate the periodogram of the cycles in that irregular data, regardless of the timing of the observations. Even if all you have are observations scattered at various times throughout the year, with entire years missing and some years only having two observations while other years have two hundred observations … no matter. All that affects is the error of the results, it doesn’t prevent the calculation as it does with the FFT.
The third advantage is that the slow Fourier transform is explainable in layman’s terms. If you tell folks that you are transforming data from the time domain to the frequency domain, people’s eyes glaze over. But everyone understands the idea of e.g. a slow six-inch (150 mm) decade-long swing in the sea level, and that is what I am measuring directly and experimentally. Which leads me to …
… the fourth advantage, which is that the results are in the same units as the data. This means that a slow Fourier transform of tidal data gives answers in mm, and an SFT of temperature data (as in Figure 1) gives answers in °C. This allows for an intuitive understanding of the meaning of the results.
The final and largest advantage, however, is that the SFT method allows the calculation of the actual statistical significance of the results for each individual cycle length. The SFT involves fitting a sine wave to some time data. Once the phase and amplitude are optimized (fit) to the best value, we can use a standard least squares linear model to determine the p-value of the relationship between that sine wave and the data. In other words, this is not a theoretical calculation of the significance of the result. It is the actual p-value of the actual sine wave vis-a-vis the actual data at that particular cycle length. As a result, it automatically adjusts for the fact that some of the data may be missing. Note that I have adjusted for autocorrelation using the method of Nychka. In Figure 1 above, results that are significant at the 90% threshold are shown in color. See the note at the end for further discussion regarding significance.
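For those who want to see the bones of the method, here is a minimal R sketch of the idea. To be clear, this is not the code in the zipped folder linked at the end of the post, just an illustration: for each trial period, regress the data on a sine and a cosine of that period (which is the same thing as fitting a single sine wave with free amplitude and phase), convert the two coefficients to a peak-to-peak amplitude, and take the significance from the joint F-test of the fit. The autocorrelation adjustment described above is a separate step and is not shown here.

# Slow Fourier transform sketch: t = observation times, y = data (may contain NAs),
# periods = trial cycle lengths in the same units as t. Gaps and irregular spacing
# are handled automatically by lm().
sft <- function(t, y, periods) {
  out <- data.frame(period = periods, p2p = NA_real_, p.value = NA_real_)
  for (i in seq_along(periods)) {
    P  <- periods[i]
    sn <- sin(2 * pi * t / P)
    cn <- cos(2 * pi * t / P)
    fit <- lm(y ~ sn + cn, na.action = na.omit)
    a <- coef(fit)["sn"]
    b <- coef(fit)["cn"]
    out$p2p[i] <- 2 * sqrt(a^2 + b^2)          # peak-to-peak amplitude, in data units
    f <- summary(fit)$fstatistic               # joint test of the sine and cosine terms
    out$p.value[i] <- pf(f[1], f[2], f[3], lower.tail = FALSE)
  }
  out
}

With monthly data and t in months, something like sft(t, y, periods = 2:floor(length(y)/2)) gives a periodogram of the same general kind as Figure 1, in peak-to-peak data units, with no requirement that the observations be evenly spaced or complete.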
Finally, before moving on, let me emphasize that I doubt if I’m the first person to come up with this method. All I claim is that I came up with it independently. If anyone knows of an earlier reference to the technique, please let me know.
So with that as prologue, let’s take a look at Figure 1, which I repeat here for ease of reference.
There are some interesting things and curious oddities about these results. First, note that we have three spatial scales involved. Armagh is a single station. The CET is a three-station average taken to be representative of the country. And the Berkeley Earth and HadCRUT4 data are global averages. Despite that, however, the cyclical swings in all four cases are on the order of 0.3 to 0.4°C … I’m pretty sure I don’t understand why that might be. Although I must say, it does have a certain pleasing fractal quality to it. It’s curious, however, that the cycles in an individual station should have the same amplitude as the cycles in the global average data … but we have to follow the facts wherever they may lead us.
The next thing that I noticed about this graphic was the close correlation between the Armagh and the CET records. While these two areas are physically not all that far apart, they are on different islands, and one is a three-station average. Despite that, they both show peaks at 3, 7.8, 8.2, 11, 13, 14, 21, 24, 28, 34, and 42 years. The valleys between the peaks are also correlated. At about 50 years, however, they begin to diverge. Possibly this is just random fluctuation, although the CET dropping to zero at 65 years would seem to rule that out.
I do note, however, that neither the Armagh nor the CET show the reputed 60-year period. In fact, none of the datasets show significant cycles at 60 years … go figure. Two of the four show peaks at 55 years … but both of them have larger peaks, one at 75 and one at 85 years. The other two (Armagh and HadCRUT) show nothing around 60 years.
If anything, this data would argue for something like an 80-year cycle. However … let’s not be hasty. There’s more to come.
Here’s the next oddity. As mentioned above, the Armagh and CET periodograms have neatly aligned peaks and valleys over much of their lengths. And the Berkeley Earth periodogram looks at first blush to be quite similar as well. But Figure 2 reveals the oddity:
Figure 2. As in Figure 1 (without significance information). Black lines connect the peaks and valleys of the Berkeley Earth and CET periodograms. As above, the length of each periodogram is half the length of the dataset.
The peaks and valleys of CET and Armagh line up one right above the other. But that’s not true about CET and Berkeley Earth. They fan out. Again, I’m pretty sure I don’t know why. It may be a subtle effect of the Berkeley Earth processing algorithm, I don’t know.
However, despite that, I’m quite impressed by the similarity between the station, local area, and global periodograms. The HadCRUT dataset is clearly the odd man out.
Next, I looked at the differences between the first and second halves of the individual datasets. Figure 3 shows that result for the Armagh dataset. As a well-documented single-station record, presumably this is the cleanest and most internally consistent dataset of the four.
Figure 3. The periodogram of the full Armagh dataset, as well as of the first and second halves of that dataset.
This is a perfect example of why I pay little attention to purported cycles in the climate datasets. In the first half of the Armagh data, which covers a hundred years, there are strong cycles centered on 23 and 38 years, and almost no power at 28 years.
In the second half of the data, both the strong cycles disappear, as does the lack of power at 28 years. They are replaced by a pair of much smaller peaks at 21 and 29 years, with a minimum at 35 years … go figure.
And remember, the 24 and 38 year periods persisted for about four and about three full periods respectively in the 104-year half-datasets … they persisted for 100 years, and then disappeared. How can one say anything about long-term cycles in a system like that?
Of course, having seen that odd result, I had to look at the same analysis for the CET data. Figure 4 shows those periodograms.
Figure 4. The periodogram of the full CET dataset, and the first and second halves of that dataset.
Again, this supports my contention that looking for regular cycles in climate data is a fool’s errand. Compare the first half of the CET data with the first half of the Armagh data. Both contain significant peaks at 23 and 38 years, with a pronounced v-shaped valley between.
Now look at the second half of each dataset. Each has four very small peaks, at 11, 13, 21, and 27 years, followed by a rising section to the end. The similarity in the cycles of both the full and half datasets from Armagh and the CET, which are two totally independent records, indicates that the cycles which are appearing and disappearing synchronously are real. They are not just random fluctuations in the aether. In that part of the planet, the green and lovely British Isles, in the 19th century there was a strong ~22 year cycle. A hundred years, that’s about five full periods at 22 years per cycle. You’d think after that amount of time you could depend on that … but nooo, in the next hundred years there’s no sign of the pesky 22-year period. It has sunk back into the depths of the fractal ocean without a trace …
One other two-hundred year dataset is shown in Figure 1. Here’s the same analysis using that data, from Berkeley Earth. I have trimmed it to the 1796-2002 common period of the CET and Armagh.
Figure 5. The SFT periodogram of the full Berkeley Earth dataset, and the first and second halves of that dataset.
Dang, would you look at that? That’s nothing but pretty. In the first half of the data, once again we see the same two peaks, this time at 24 and 36 years. And just like the CET, there is no sign of the 24-year peak in the second hundred years. It has vanished, just like in the individual datasets. In Figure 6 I summarize the first and second halves of the three datasets shown in Figs. 3-5, so you can see what I mean about the similarities in the timing of the peaks and valleys:
Figure 6. SFT Periodograms of the first and second halves of the three 208-year datasets. Top row is Berkeley Earth, middle row is CET, and bottom row is Armagh Observatory.
So this is an even further confirmation of both the reality of the ~23-year cycle in the first half of the data … as well as the reality of the total disappearance of the ~23-year cycle in the last half of the data. The similarity of these three datasets is a bit of a shock to me, as they range from an individual station to a global average.
So that’s the story of the SFT, the slow Fourier transform. The conclusion is not hard to draw. Don’t bother trying to capture temperature cycles in the wild, those jokers have been taking lessons from the Cheshire Cat. You can watch a strong cycle go up and down for a hundred years. Then just when you think you’ve caught it and corralled it and identified it, and you have it all caged and fenced about with numbers and causes and explanation, you turn your back for a few seconds, and when you turn round again, it has faded out completely, and some other cycle has taken its place.
Despite that, I do believe that this tool, the slow Fourier transform, should provide me with many hours of entertainment …
My best wishes to all,
w.
As Usual, Gotta Say It: Please, if you disagree with me (and yes, unbelievably, that has actually happened in the past), I ask you to have the courtesy to quote the exact words that you disagree with. It lets us all understand just what you think is wrong.
Statistical Significance: As I stated above, I used a 90% level of significance in coloring the significant data. This was for a simple reason. If I use a 95% significance threshold, almost none of the cycles are statistically significant. However, as the above graphs show, the agreement not only between the three independent datasets but between the individual halves of the datasets is strong evidence that we are dealing with real cycles … well, real disappearing cycles, but when they are present they are undoubtedly real. As a result, I reduced the significance threshold to 90% to indicate at least a relative level of statistical significance. Since I maintained that same threshold throughout, it allows us to make distinctions of relative significance based on a uniform metric.
Alternatively, you could argue for the higher 95% significance threshold, and say that this shows that there are almost no significant cycles in the temperature data … I’m easy with either one.
Data and Code: All the data and code used to do the analysis and make these graphics are in a 1.5 Mb zipped folder called “Slow Fourier Transform”. If you change your R directory to that folder it should all work. The file “sea level cycles.R” is the main file. It contains piles of code for this and the last two posts on tidal cycles. The section on temperature (this post) starts at about line 450. Some code on this planet is user-friendly. This code is user-aggressive. Things are not necessarily in order. It’s not designed to be run top to bottom. Persevere, I’ll answer questions.
For everyone complaining that Willis has not discovered anything new here, or criticizing him for his lack of mathematics education… consider this…
Although many of you already knew about Fourier Transforms, I pretty much guarantee that none of you stumbled across them and developed them on your own, without having had them taught to you. Willis just did exactly that.
So yes, Willis may be behind in formal education compared to you complainers, but he’s clearly shown that he’s just flat-out smarter than you.
So… please keep your criticisms to yourselves. Most of us just don’t want to hear it.
Willis
SFT
Or Fourier Series? After the recent discussion on your previous post this will no doubt prove to be very useful in my own work.
But with regard to your own work here it’s all rather immaterial as the results seem pretty good.
For what it’s worth, I’ll probably do a similar “Fourier Series” approach later today. But I’ll probably do it via multilinear regression; this needs to be fast, at least as fast as a simple DFT with similar “dimensions”. With an efficient method (R has got some) for solving large systems of linear equations, your approach could be as fast or faster. Anyways, you might have done this already, in which case ignore this.
Anton
… please keep your criticisms to yourselves. Most of us just don’t want to hear it.
Calm down, calm down…
I don’t think anyone is having a go. It seems like quite a courteous exchange to me. And it is a blog where nobody learns anything if we all just pat each other on the back.
I didn’t know that – thought Willis had a PhD in physics, chemistry or mechanical engineering.
Anyway I suspect Fourier invented his series solution to help him * solve transient heat transfer problems in conduction (see Fourier’s general equation).
* Certainly others such as Schack, H P Gurney and J Lurie applied Fourier series to this end.
chemengrls says:
May 6, 2014 at 2:29 am
Man, I wish. I’ve taken exactly two college science classes, Physics 101 and Chemistry 101. Oh, and calculus. I learned about Fourier from the Encyclopedia Britannica.
w.
I tell you the mods round here are a useful bunch…cheers for correcting my last comment.
Willis Eschenbach: “I’m using the lm() function in R. The R linear model function reports the statistics of the fit as a p-value. ”
Thanks for the response. I’m still looking for that output of lm() in my distribution of R, but I’m sure that some day some errant combination of keystrokes will overcome R’s inscrutable documentation and reveal that my version has had it all the time.
Willis,
With this comment:
“I knew some unpleasant person would come along with this lame accusation”
you seem to be acting hyper-sensitive to what was fairly mild criticism. I would also politely remind you that you may think my words unpleasant, which is your prerogative, but as you don’t know me personally then to suggest I am an unpleasant person is not reasonable. Especially as I went to some trouble to write you several paragraphs on interesting places to visit in Southern England, including Salisbury Cathedral, a little while back.
Joe Born (May 6, 2014 at 4:47 am)
Try running the function summary() on the output of the lm() function.
As well, the function confint() can provide confidence intervals for all of the estimated coefficients.
E.g.:
reg = lm(y~x)
summary(reg)
confint(reg)
cd says:
May 6, 2014 at 4:20 am
That was me, I try to make folks’ contributions look right.
w.
Joe Born says:
May 6, 2014 at 4:47 am
Ah, my bad, sorry. You need to look at the summary of the lm results, like so:
summary(lm())
Use the “names” function to reveal the named components, like this …
names(summary(lm()))
w.
RomanM and Willis Eschenbach:
Thanks a lot for the pointers. They worked for me.
ThinkingScientist says:
May 6, 2014 at 5:45 am
First, ThinkingScientist, let me say that you are right. The fact that you have acted in an unpleasant manner doesn’t make you an unpleasant person. That was a bridge too far. My apologies. As to your taking the trouble to write regarding Salisbury, thanks for that as well. I fear that with the literally hundreds of people I interact with on the web, who said what six months ago is often a blur.
To the topic at hand. You had said:
You made a most unpleasant accusation—that I was trying to take credit for something invented by another man. This is called “scientific plagiarism”, presenting a new method of analysis as if it were one’s own when someone else came up with the idea.
Now, that’s not a nice thing to accuse a scientist of, even on a good day.
But this is not my first rodeo. I’ve had warm, intelligent, sensitive folks like you try this particular unpleasantness before. So, knowing what was likely coming, I specifically said in the head post that it was likely NOT a new method of analysis, and that while I came up with it independently I was likely NOT the first person to come up with the idea. (Although to date no one has come up with any prior art.)
To accuse me of plagiarism after reading my specific disclaimer is not just unpleasant. It is negligent as well.
Now, am I “hyper-sensitive to what was fairly mild criticism”? First off, an accusation of scientific plagiarism is hardly “fairly mild criticism” on my planet.
Second, I just had Dr. Roy Spencer unsuccessfully try to accuse me of the same thing. Fortunately, the web never forgets. I had to do with him what I’ve done with you, specifically object and use the web record to show conclusively that he was wrong. Tends to put one’s nerves on edge, though.
Third, I come into this battle armed with nothing but my reputation and my honor. I don’t have a PhD. I don’t have a string of publications as long as your arm. I don’t have colleagues and graduate students and co-workers. I don’t have awards from John Kerry’s wife like Hansen, or a No-Bull prize like Mann. I’m not the head of a department or a school. I work building houses.
I enlisted in the climate wars equipped with my honor and my insatiable curiosity and my cranial horsepower and little more. As a result, I’ve had to fight hard for every inch that I’ve gained. And yes, that’s made me sensitive when people try to impugn my honor as you did.
I never try to take credit for what another man has done, I want to give credit where credit is due. Look at this post. Nick Stokes pointed out that he’d written a post I referred to … so I went back in and edited this post to give him that credit. More to the point, I knew someone likely had invented the idea before, and I specifically stated that, and I asked people to name the names of whoever did invent this method. Accusing me of plagiarism after that is … well, unpleasant.
Finally, ThinkingScientist, I have a simple rule. I don’t accept moral or ethical instruction from someone who is unwilling to sign their own name to their own words. Call me crazy, but I won’t do it. I’m perfectly happy to take scientific instruction from such a person. Science is about the ideas and whether they are right or wrong.
But taking scientific instruction is the limit. If you want me to listen to your advice on more serious matters, you’ll have to sign your words. See, I have to defend the things that I said on the web, whether I said them last year or ten years ago.
But you? You can give me crappy advice, accuse me of plagiarism, get upset when I call you on it … and then change your name to GamerBabe37 and never have to take any responsibility for what you’ve said in the slightest.
So I’m sorry, ThinkingScientist, but I won’t take advice and admonition on ethics and honor from some random internet popup who is unwilling to take responsibility for his own words.
Note that I’m not saying that anonymity is bad. There are valid reasons for it. It’s a choice that people make, for a whole variety of different reasons.
It’s just that when you make that choice, when you slip on that mask, there are things that you give up. One of them is the right to lecture folks on their behavior. If you won’t take responsibility for your own behavior, why should others pay the slightest attention to your advice in that matter?
Best regards,
w.
Comments on this thread suggest that there is still much confusion about the various “Fourier Performers” in the arena. Not surprising as there are (at least) seven of them.
http://electronotes.netfirms.com/FourierMap.jpg
There are the two transforms that address systems more than signals (Laplace Transform and z-Transform). Then there is the very familiar Fourier Series – FS, and its “parent”, the (continuous-time) Fourier Transform – CTFT. (That’s four.) The FS has a “dual” that is often overlooked, the Discrete-Time Fourier Transform, or DTFT. The DTFT is merely an FS with the roles of time and frequency reversed. It relates a discrete time series (dual to discrete “harmonics” in the FS) to a corresponding now-periodic spectrum (AKA “aliasing”). The DTFT makes five, and the DTFT (note well the extra T) is DIFFERENT from the Discrete Fourier Transform (DFT) and the fast FFT, where both time and frequency are discrete and periodic (that’s all seven).
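For reference, the two that are most often confused can be written side by side using the standard textbook definitions:

$$\text{DFT (what the FFT computes):}\qquad X[k] \;=\; \sum_{n=0}^{N-1} x[n]\, e^{-i 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1$$

$$\text{DTFT:}\qquad X\!\left(e^{i\omega}\right) \;=\; \sum_{n=-\infty}^{\infty} x[n]\, e^{-i\omega n}, \qquad \omega \text{ continuous and } 2\pi\text{-periodic}$$

The first is discrete and periodic in both domains; the second maps a discrete time series to a continuous, periodic function of frequency, which is exactly the “dual of the Fourier series” relationship described above.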
The “Fourier Map” in the link above summarizes multiple notions and relationships, and I have used it for 20 years.
Few of these transforms yield to closed form solutions. When computing from actual data, we usually need to trick the FFT into yielding a satisfactory approximation.
Perhaps visitors to the Fourier arena claiming to be new relatives need to be vetted against this family already there. I personally haven’t figured it all out.
I posted a comment earlier but after further consideration I realize I did not furnish enough background on my perspective and what I was trying to achieve.
I spent several decades trying to resolve issues with rotating equipment. In the earlier days we collected data and analyzed with swept frequency analysis making use of filters. The results were somewhat useful and also very noisy.
To me the introduction of the desktop FFT analyzer was roughly equivalent to the introduction of the HP 35 calculator (which cost approximately half a month’s take-home pay at the time). The FFT made visible to me the things that needed to be worked on, with the objective of reducing vibration and making the equipment more reliable.
We had the luxury of acquiring raw data over an extended period of time so we did not suffer from having insufficient data. In those days we were simply doing detective work to find out what needed to be worked on. There was no such thing as a general computer model of the equipment. Rather, we were simply trying to find out what might need to be considered.
I did not perform the FFT analysis of the analogue tapes. My job was to make use of the analyzed data to look for explanations that would lead to remedies.
There are many comments on this article and I would be in the camp that there is insufficient data. That means plan B.
I have used the technique before but my approach to the detective work was simply to try and find an equation that might just fit the data. The validity of the results can be judged later but I simply was trying to detect natural cycles that might be present. The model equation consisted of 5 sinusoids.
Many readers may be way ahead of me and were already aware of what might be present. I was not so what I did was consistent with the approach I have used for years and that was simply to do some detective work so that I could understand what I was dealing with. It will be up to you to determine if anything I have done here is useful or insightful.
I inspected Hadcrut4 data on both a monthly basis and a yearly basis.
For the monthly data, fitting the model equation yielded the following:
I wish I knew how to make the graph available but I don’t. I judge it to be a good fit.
I did the same with the Hadcrut4 data on a yearly basis instead of the monthly basis. The results were:
Char    Guess                  bmin    bmax    b                      Period
Amp     0.373928293335057      -1000   1000    0.456756591908584
Freq    0.00288184438040346    -1000   1000    0.0028356804964        352.649038298718
Phase   1.65015934150811       -1000   1000    2.88817321720208
Amp     0.121904543009585      -1000   1000    0.112106012502226
Freq    0.0166666666666667     -1000   1000    0.0147134718840        67.9649241104327
Phase   1                      -1000   1000    23.1091565760669
Amp     -0.0159596662864541    -1000   1000    0.0962192542383451
Freq    0.0061                 -1000   1000    0.004934103435         202.671065392359
Phase   1                      -1000   1000    12.8395859438275
Amp     0.1                    -1000   1000    -0.0253860619362965
Freq    0.25                   -1000   1000    0.239294052343         4.17895885922845
Phase   1                      -1000   1000    91.0841625820464
Amp     0.1                    -1000   1000    0.0475776733567028
Freq    0.04                   -1000   1000    0.046203369505         21.6434431232757
Phase   1                      -1000   1000    -71.9060344118493
It took 681 iterations. The SSE was 44.037. The correlation coefficient was 0.861.
The results for Hadcrut4 on a yearly basis.
Char    Guess                  bmin    bmax    b                      Period
Amp     0.373928293335057      -1000   1000    0.5590459822568
Freq    0.00288184438040       -1000   1000    0.00287618100569       347.683264029118
Phase   1.65015934150811       -1000   1000    2.53417606588117
Amp     0.121904543009585      -1000   1000    0.111200199537111
Freq    0.0166666666666667     -1000   1000    0.014688221188         68.0817634172082
Phase   0.1                    -1000   1000    23.4628856990436
Amp     0.01                   -1000   1000    -0.202734098630665
Freq    0.0049                 -1000   1000    0.004071935205         245.583475568171
Phase   1                      -1000   1000    7.49413657611414
Amp     0.02                   -1000   1000    0.0229749619103338
Freq    0.25                   -1000   1000    0.239142083456         4.18161448435935
Phase   -5                     -1000   1000    191.081913392333
Amp     0.1                    -1000   1000    0.0462677041430509
Freq    0.04                   -1000   1000    0.0461238862996        21.6807402894059
Phase   1                      -1000   1000    -70.7912168132773
It required 654 iterations. The resulting SSE was 1.35 and the correlation coefficient was 0.94.
I wish I knew how to include the graphs. I will learn that eventually.
There are some things in common between the two evaluations. There is a long-period cycle of approximately 350 years. Is that real? It could also be a 1000 year cycle identified not too long ago from ice cores. In that same article I believe they identified something around 334 years. Could it be the same? I do not know.
Instead of a 60 year cycle, both analyses show something in the mid 60s. There are also similarities in the 4 year and 21 year cycles. The ones that appear different are the two in the 200 year time frame.
In the end, all I have done is apply my experience in an effort to get around the lack of a longer time record of temperatures. This is nothing more than initial detective work. I am just trying to figure out what I might be dealing with and have to explain.
Climate knowledge is a little out of my realm but if I think I can contribute something useful I will try. I am hard to embarrass and am willing to suffer embarrassment. You can learn from that too without getting hard feelings.
For the most part for the decades I worked on rotating equipment most of the innovations came from the insight we gained from test data. To me test data are gold. We are still trying to get a robust computer model so we won’t have to rely so much on test. Sort of reminds one of today’s computer climate models.
I wrote three sentences in my last post. The first was to frame my response and suggest you were (in my view) over-reacting. The second sentence was to suggest that if you felt my words were unpleasant, it was unfair to then suggest I was also unpleasant. You have responded by apologising and accepting it was a step too far. You did not need to apologise, but thank you anyway. Finally, I used an example where I had previously tried to help you.
But then you seem to have gone on to develop an argument spread over something like 14 paragraphs, developing a theme where you are saying I have accused you of plagiarism and then delivering what appears to be a moralising sermon, partly focussing on my anonymous moniker (which has remained unchanged for about 8 years or so).
Just for the record, I have not accused you of plagiarism and certainly did not intend anything of the sort. But I stand by my points that you are being hyper-sensitive to criticism and over-reacting and I think your response to my simple 3 sentences shows this.
But hey ho, if it makes you feel better to get it off your chest, fine.
Apologies, the post above was intended for Willis. I omitted to address it correctly.
ThinkingScientist says:
May 6, 2014 at 1:07 pm
I accept and am glad to hear that you did not intend to accuse me of plagiarism. Thank you for clarifying that.
However, you assuredly accused me of plagiarism when you said:
Despite not intending to do so, you have made a clear accusation—that I am implying that I invented something I didn’t invent. That is an accusation of scientific plagiarism. And yes, I’m damn sensitive to that, especially from someone unwilling to stand behind his own words.
In fact, I’m sensitive enough to specifically deny a couple times in the head post and my previous post that I was the first person to come up with the idea. When I first described the idea I wrote:
And in the head post I said:
Now, after I’ve specifically disowned the idea that I’m the inventor a couple times, you accuse me of writing “as though some new method of analysis were invented”?
And when I protest, you get all huffy and say it wasn’t an accusation of plagiarism? You accuse me of taking credit for some other man’s invention. How is that not an accusation of plagiarism?
And I stand by what I said. I don’t take advice on ethics, morality, or whether I’m “hyper-sensitive” from anonymous internet popups. I thought I’d made that clear.
w.
PS—You keep repeating that you only wrote “three simple sentences” in exculpation, as though it is not possible to make untrue accusations in that short a paragraph … despite having just proven that it is quite possible to do so.
I do not understand R-Code anywhere near well enough to tell if the SFT is a variation on, or even perhaps identical to something familiar in the Fourier arena. I do understand the standard Fourier ploys fairly well. As I suggested in a comment above, we do understand the Fourier Series, the Fourier Transform, and the FFT. The FFT exploits the even spacings in both time and frequency, which results in a periodicity in the exponential factors (colorfully called “twiddle factors”) such that an efficient algorithm can be run. Otherwise the relationship is N equations in N unknowns (slow – but usually speed is not important).
Moreover, the FS has a much-neglected dual cousin the Discrete-Time Fourier Transform (DTFT), rarely used except to find the frequency response of a filter from its impulse response, but which describes a periodic function of frequency in terms of discrete time samples. This is the DTFT, NOT the DFT=FFT. It can (being a frequency response) be solved for any frequency, equally spaced or clustered. Likewise, the FS of course gives time values for any choices of time. These calculations can be inverted easily but not fast. Once solved for arbitrary spacings in time and/or frequency, it of course gives the FFT answer for equal spacings.
I am guessing, and very possibly not helping any. I need to see equations, or Matlab.
Thanks, Bernie. Here’s what I’m doing.
For each cycle length, say 43 months, I fit a 43-month sine wave to the data. I note down the peak to peak amplitude of the resulting best-fit sine wave.
That’s it. That’s all I’m doing. Fitting a sine wave of each possible cycle length from 2 to datalength/2 to the data, and recording the amplitude. That’s what I’m calling the slow Fourier transform.
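In R terms, a single one of those fits (a minimal sketch of the idea with made-up starting values, not my actual code) looks something like this:

# fit one 43-month sine wave, free amplitude and phase, to data y at times t (t in months)
fit <- nls(y ~ A * sin(2 * pi * t / 43 + phi),
           start = list(A = sd(y, na.rm = TRUE), phi = 0))
p2p <- 2 * abs(coef(fit)["A"])    # record the peak-to-peak amplitude for the 43-month bin

Do that for every cycle length of interest and you have the periodogram. (The same fit can also be done as a linear regression on a sine plus a cosine of the given period, which skips the iterative optimization.)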
This may indeed be the same as the Discrete Time Fourier Transform. Sounds possible. I’ll need to do more research, which I do love to do. Thanks for the direction.
Hope that helps,
w.
Willis

I read your comment above where you describe fitting a single frequency with a period of 43 months.
I know I have more to learn about posting comments here. In my last one I thought the message would be clear but it came out like garbage. So I shall try again.
In earlier postings you noted the lack of data and I would certainly agree. This is why I tried a curve fitting procedure that I have used since the mid 80s that was available with the DOS version of TKsolver. I tried to fit both the monthly and the yearly Hadcrut4 data with a curve that is composed of 5 sinusoids.
First the monthly data. I make an initial guess for amplitude, frequency and phase for each of the sine waves. The resulting answer is given in the b column.
Monthly
Char    Guess     b         Period
Amp     0.3739    0.4568
Freq    0.0029    0.0028    352.6490
Phase   1.6502    2.8882
Amp     0.1219    0.1121
Freq    0.0167    0.0147    67.9649
Phase   1.0000    23.1092
Amp     -0.0160   0.0962
Freq    0.0061    0.0049    202.6711
Phase   1.0000    12.8396
Amp     0.1000    -0.0254
Freq    0.2500    0.2393    4.1790
Phase   1.0000    91.0842
Amp     0.1000    0.0476
Freq    0.0400    0.0462    21.6434
Phase   1.0000    -71.906
It took 681 iterations to come up with the above results.
The correlation coefficient was 0.8605
When I did the same for the yearly averaged data I got the following:
Yearly
Char    Guess     b         Period
Amp     0.3739    0.5590
Freq    0.0029    0.0029    347.6833
Phase   1.6502    2.5342
Amp     0.1219    0.1112
Freq    0.0167    0.0147    68.0818
Phase   0.1000    23.4629
Amp     0.0100    -0.2027
Freq    0.0049    0.0041    245.5835
Phase   1.0000    7.4941
Amp     0.0200    0.0230
Freq    0.2500    0.2391    4.1816
Phase   -5.0000   191.0819
Amp     0.1000    0.0463
Freq    0.0400    0.0461    21.6807
Phase   1.0000    -70.7912
It took 654 iterations to come up with the above results.
The correlation coefficient was 0.942
I apologize that my last post was useless. Maybe this one will fall into the same category for different reasons. I was just doing detective work and trying to figure out if there were cycles in the raw data.
I wish I knew how to enter a figure but the first link is to the monthly figure and the second is to the yearly figure.
I think both figures pass the eyeball test. Both sets of data gave roughly the same result except for a wide difference between 202 years and 245. The cycle with the long period of around 350 years might be similar to the 334 year cycle that was identified in the ice cores in an article published not long ago. In another post a bit further back a 350 year solar cycle seemed to show up.
In any case, I am trying to be helpful and not a bother.
My experience is rotating equipment not science and all I did here was what I would have done if I had minimal data to work with.
Thanks Willis –
That helps immensely. It appears to me that what you are doing is less of a Fourier transform of any flavor and more like a sweeping bandpass filter (a classical frequency-analysis method), or perhaps a “bank” of bandpass filters for which you then look for channels with significant responses (which is or can be similar to a DFT=FFT). I have to think more about the DTFT, but it does describe frequency responses of digital filters, so is relevant to any filter approach – or so it seems to me.
Further above I suggested “Prony’s Method” (1795!) which can be described as: “Here is some data – find the poles (resonances)”. Prony is not particular about having equally spaced data without gaps. It is really N equations in N unknowns (linear difference equations replacing non-linear trig equations). This could shorten a search for frequencies. But Prony doesn’t do well with noise. I just recently reviewed this:
http://electronotes.netfirms.com/EN221.pdf
If you look at this, the items in the references, 1b and 1c are a lot more basic.
Of course, many of these things do pretty much the same things, and are valid.
We are all still having fun – are we not? I enjoy all your posts.
Bernie
The pride of blog-lions gathered here seems to be blind to certain basics, duly explicated in introductory texts on signal/system analysis, but often neglected in DSP texts. Here is a brief “Idiot’s Guide” to them:

1. The FFT is a finite Fourier-series transform whose harmonic analysis frequencies are set by the record length and whose Nyquist frequency is set by the signal sampling rate. Its line-spectrum output consists of finite-amplitude sinusoids that orthogonally decompose the data series EXACTLY, but always under the tacit ASSUMPTION that the N-point data series d(n) is strictly N-periodic. In other words, it can be extended indefinitely by the relationship f(n+kN) = d(n), for all integer k, to produce an entirely predictable infinite data series f(m), where m = n + kN.

2. Even in the case of strictly periodic signals, whenever the record length is NOT equal to an integer multiple of the TRUE period, the FFT fails to identify the cardinal harmonics, smearing them into adjoining frequencies. This has nothing to do with any physical phenomenon, such as “line broadening,” being entirely an ARTIFACT of record length.

3. Zero-padding the data series merely changes the COMPUTATIONAL resolution of the FFT, but adds not one bit of INFORMATION about the signal outside the time-range of the record. Only in the case of time-limited signals, such as the weights of an FIR filter, does such padding preserve without any distortion whatever information is present in the data series. In all other cases, it introduces a larger set of NEW sinusoidal components, such that they all add up to the data series and whatever zeros were padded. This is clearly a wholly unrealistic representation of usually aperiodic geophysical signals that continue indefinitely before and after the available record. Especially strongly distorted are the lowest frequencies, corresponding to periods that are a large fraction of the record length.

4. The Fourier Integral has bilaterally infinite limits and an integrand NECESSARILY incorporating BOTH cosine and sine terms. It converges, however, ONLY in the case of transient signals, the integral of whose ABSOLUTE VALUES is finite over the infinite time interval. That’s why signal analysis relies upon the convergence factor of the Laplace Transform for analytic specification of non-decaying CONTINUING signals that are aperiodic. In either case, the transform is no longer discrete, but a CONTINUUM of INFINITESIMAL sinusoids, which are mutually orthogonal ONLY because the signal is specified over an infinite time interval.

5. The auto-correlation function (acf) of signals usually found in nature provides a vital means to distinguish between periodic and random
FTA:
“You’d think after that amount of time you could depend on that … but nooo, in the next hundred years there’s no sign of the pesky 22-year period. It has sunk back into the depths of the fractal ocean without a trace …”
Willis, try this. Generate a series w(k) of “white” noise using your favorite random number generator. Feed it into a system model of the form
y(k) = 1.995*y(k-1) - 0.999*y(k-2) + w(k)
Plot y(k). What do you see? Something qualitatively like this?
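A minimal R sketch of that recursion (assuming standard normal white noise and an arbitrary series length; the exact choices don't matter, the behavior is qualitative):

set.seed(1)                                       # any seed will do
N <- 2000
w <- rnorm(N)                                     # "white" noise input
y <- numeric(N)
for (k in 3:N) y[k] <- 1.995 * y[k-1] - 0.999 * y[k-2] + w[k]
plot(y, type = "l", xlab = "k", ylab = "y(k)")    # near-periodic swings whose amplitude and phase wander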
If you are interested, I will be glad to discuss this further with you, and the implications for your question above.
Bart –
Am I missing something? It seems you just have a sharp 2nd-order bandpass with peak at frequency 0.01 so what you got is exactly what we expect. Isn’t it just ordinary bandpass filtered white noise?
http://electronotes.netfirms.com/Res.jpg
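For anyone who wants to check that arithmetic, the pole locations fall straight out of R’s polyroot():

r <- polyroot(c(0.999, -1.995, 1))    # roots of z^2 - 1.995*z + 0.999
Mod(r)                                 # about 0.9995: poles just inside the unit circle
abs(Arg(r)) / (2 * pi)                 # about 0.01 cycles/sample: the resonant frequency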
1sky1 –
I agree with points 1 to 4, AND with 5, except that you need to say how you set a detection threshold for an autocorrelation peak. Obviously it has to be less than 100% of the zero-delay value. I think you have to try different values (90%, 50%, etc.) and then argue your choice.