Guest Post by Willis Eschenbach
While investigating the question of cycles in climate datasets (Part 1, Part 2), I invented a method I called “sinusoidal periodicity”. What I did was to fit a sine wave of various periods to the data, and record the amplitude of the best fit. I figured it had been invented before, so I asked people what I was doing and what its name was. I also asked if there was a faster way to do it, as my method does a lot of optimization (fitting) and thus is slow. An alert reader, Michael Gordon, pointed out that I was doing a type of Fast Fourier Transform (FFT) … and provided a link to Nick Stokes’ R code to verify that indeed, my results are identical to the periodogram of his Fast Fourier Transform. So it turns out that what I’ve invented can best be described as the “Slow Fourier Transform”, since it does exactly what the FFT does, only much slower … which sounds like bad news.
My great thanks to Michael, however, because actually I’m stoked to find out that I’m doing a Fourier transform. First, I greatly enjoy coming up with new ideas on my own and then finding out people have thought of them before me. Some folks might see that as a loss, finding out that someone thought of my invention or innovation before I did. But to me, that just means that my self-education is on the right track, and I’m coming up with valuable stuff. And in this case it also means that my results are a recognized quantity, a periodogram of the data. This is good news because people already understand what it is I’m showing.
Figure 1. Slow Fourier transform periodograms of four long-term surface air temperature datasets. Values are the peak-to-peak amplitude of the best-fit sine wave at each cycle length. The longest period shown in each panel is half the full length of the dataset. Top panel is Armagh Observatory in Ireland. The second panel is the Central England Temperature (CET), which is an average of three stations in central England. Third panel is the Berkeley Earth global temperature dataset. The fourth panel shows the HadCRUT4 global temperature dataset. Note that the units are in degrees C, and represent the peak-to-peak swings in temperature at each given cycle length. Data in color are significant after adjustment for autocorrelation at the 90% level. Significance is calculated after removing the monthly seasonal average variations.
I’m also overjoyed that my method gives identical results to its much speedier cousin, the Fast Fourier transform (FFT), because the Slow Fourier Transform (SFT) has a number of very significant advantages over the FFT. These advantages are particularly important in climate science.
The first big advantage is that the SFT is insensitive to gaps in the data. For example, the Brest tide data goes back to 1807, but there are some missing sections, e.g. from 1836-1846 and 1857-1860. As far as I know, the FFT cannot analyze the full length of the Brest data in one block, but that makes no difference to the SFT. It can utilize all of the data. As you can imagine, in climate science this is a very common issue, so this will allow people to greatly extend the usage of the Fourier transform.
The second big advantage is that the SFT can be used on an irregularly spaced time series. The FFT requires data that is uniformly spaced in time. But there’s a lot of valuable irregularly spaced climate data out there. The slow Fourier transform allows us to calculate the periodogram of the cycles in that irregular data, regardless of the timing of the observations. Even if all you have are observations scattered at various times throughout the year, with entire years missing and some years only having two observations while other years have two hundred observations … no matter. All that affects is the uncertainty of the results; it doesn’t prevent the calculation, as it would with the FFT.
The third advantage is that the slow Fourier transform is explainable in layman’s terms. If you tell folks that you are transforming data from the time domain to the frequency domain, people’s eyes glaze over. But everyone understands the idea of e.g. a slow six-inch (150 mm) decade-long swing in the sea level, and that is what I am measuring directly and experimentally. Which leads me to …
… the fourth advantage, which is that the results are in the same units as the data. This means that a slow Fourier transform of tidal data gives answers in mm, and an SFT of temperature data (as in Figure 1) gives answers in °C. This allows for an intuitive understanding of the meaning of the results.
The final and largest advantage, however, is that the SFT method allows the calculation of the actual statistical significance of the results for each individual cycle length. The SFT involves fitting a sine wave to some time data. Once the phase and amplitude are optimized (fit) to the best value, we can use a standard least squares linear model to determine the p-value of the relationship between that sine wave and the data. In other words, this is not a theoretical calculation of the significance of the result. It is the actual p-value of the actual sine wave vis-a-vis the actual data at that particular cycle length. As a result, it automatically adjusts for the fact that some of the data may be missing. Note that I have adjusted for autocorrelation using the method of Nychka. In Figure 1 above, results that are significant at the 90% threshold are shown in color. See the note at the end for further discussion regarding significance.
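Here is a minimal sketch in R of the basic idea (an illustration only, not the code in the zip file below; the function and column names are invented). It uses the standard shortcut of regressing on a zero-phase sine and cosine at each trial period, which is mathematically equivalent to searching over phase, and it reports the raw p-value before any autocorrelation adjustment. Because it is just a least-squares regression on whatever observation times you have, gaps and irregular spacing are handled automatically.

```r
# Illustrative sketch only -- not the code from the "Slow Fourier Transform" folder.
# Fit one trial cycle length to (possibly gappy, irregularly spaced) data.
sft_one_period <- function(time, y, period) {
  w   <- 2 * pi / period
  s   <- sin(w * time)
  cc  <- cos(w * time)
  fit <- lm(y ~ s + cc, na.action = na.omit)       # missing values simply drop out
  amp <- sqrt(sum(coef(fit)[c("s", "cc")]^2))      # semi-amplitude of the best-fit sine
  f   <- summary(fit)$fstatistic                   # overall F-test of the fit
  p   <- unname(pf(f[1], f[2], f[3], lower.tail = FALSE))  # raw p-value, no AR adjustment
  c(peak_to_peak = 2 * amp, p_value = p)
}

## Hypothetical usage: df has columns `time` (decimal years) and `temp` (deg C).
# periods <- seq(2, 100, by = 0.25)
# pgram   <- t(sapply(periods, function(P) sft_one_period(df$time, df$temp, P)))
```

The autocorrelation (Nychka) adjustment described above would then be applied to that raw p-value.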
Finally, before moving on, let me emphasize that I doubt if I’m the first person to come up with this method. All I claim is that I came up with it independently. If anyone knows of an earlier reference to the technique, please let me know.
So with that as prologue, let’s take a look at Figure 1, which I repeat here for ease of reference.
There are some interesting things and curious oddities about these results. First, note that we have three spatial scales involved. Armagh is a single station. The CET is a three-station average taken to be representative of the country. And the Berkeley Earth and HadCRUT4 data are global averages. Despite that, however, the cyclical swings in all four cases are on the order of 0.3 to 0.4°C … I’m pretty sure I don’t understand why that might be. Although I must say, it does have a certain pleasing fractal quality to it. It’s curious, however, that the cycles in an individual station should have the same amplitude as cycles in the global average data … but we have to follow the facts wherever they may lead us.
The next thing that I noticed about this graphic was the close correlation between the Armagh and the CET records. While these two areas are physically not all that far apart, they are on different islands, and one is a three-station average. Despite that, they both show peaks at 3, 7.8, 8.2, 11, 13, 14, 21, 24, 28, 34, and 42 years. The valleys between the peaks are also correlated. At about 50 years, however, they begin to diverge. Possibly this is just random fluctuation, although the CET dropping to zero at 65 years would seem to rule that out.
I do note, however, that neither the Armagh nor the CET show the reputed 60-year period. In fact, none of the datasets show significant cycles at 60 years … go figure. Two of the four show peaks at 55 years … but both of them have larger peaks, one at 75 and one at 85 years. The other two (Armagh and HadCRUT) show nothing around 60 years.
If anything, this data would argue for something like an 80-year cycle. However … let’s not be hasty. There’s more to come.
Here’s the next oddity. As mentioned above, the Armagh and CET periodograms have neatly aligned peaks and valleys over much of their lengths. And the Berkeley Earth periodogram looks at first blush to be quite similar as well. But Figure 2 reveals the oddity:
Figure 2. As in Figure 1 (without significance information). Black lines connect the peaks and valleys of the Berkeley Earth and CET periodograms. As above, the length of each periodogram is half the length of the dataset.
The peaks and valleys of CET and Armagh line up one right above the other. But that’s not true about CET and Berkeley Earth. They fan out. Again, I’m pretty sure I don’t know why. It may be a subtle effect of the Berkeley Earth processing algorithm, I don’t know.
However, despite that, I’m quite impressed by the similarity between the station, local area, and global periodograms. The HadCRUT dataset is clearly the odd man out.
Next, I looked at the differences between the first and second halves of the individual datasets. Figure 3 shows that result for the Armagh dataset. As a well-documented single-station record, presumably this is the cleanest and most internally consistent dataset of the four.
Figure 3. The periodogram of the full Armagh dataset, as well as of the first and second halves of that dataset.
This is a perfect example of why I pay little attention to purported cycles in the climate datasets. In the first half of the Armagh data, which covers a hundred years, there are strong cycles centered on 23 and 38 years, and almost no power at 28 years.
In the second half of the data, both the strong cycles disappear, as does the lack of power at 28 years. They are replaced by a pair of much smaller peaks at 21 and 29 years, with a minimum at 35 years … go figure.
And remember, the 24- and 38-year cycles persisted for about four and about three full periods respectively in the 104-year half-datasets … they persisted for a hundred years, and then disappeared. How can one say anything about long-term cycles in a system like that?
Of course, having seen that odd result, I had to look at the same analysis for the CET data. Figure 4 shows those periodograms.
Figure 4. The periodogram of the full CET dataset, and the first and second halves of that dataset.
Again, this supports my contention that looking for regular cycles in climate data is a fool’s errand. Compare the first half of the CET data with the first half of the Armagh data. Both contain significant peaks at 23 and 38 years, with a pronounced v-shaped valley between.
Now look at the second half of each dataset. Each has four very small peaks, at 11, 13, 21, and 27 years, followed by a rising section to the end. The similarity in the cycles of both the full and half datasets from Armagh and the CET, which are two totally independent records, indicates that the cycles which are appearing and disappearing synchronously are real. They are not just random fluctuations in the aether. In that part of the planet, the green and lovely British Isles, in the 19th century there was a strong ~22 year cycle. A hundred years, that’s about five full periods at 22 years per cycle. You’d think after that amount of time you could depend on that … but nooo, in the next hundred years there’s no sign of the pesky 22-year period. It has sunk back into the depths of the fractal ocean without a trace …
One other two-hundred year dataset is shown in Figure 1. Here’s the same analysis using that data, from Berkeley Earth. I have trimmed it to the 1796-2002 common period of the CET and Armagh.
Figure 5. The SFT periodogram of the full Berkeley Earth dataset, and the first and second halves of that dataset.
Dang, would you look at that? That’s nothing but pretty. In the first half of the data, once again we see the same two peaks, this time at 24 and 36 years. And just like the CET, there is no sign of the 24-year peak in the second hundred years. It has vanished, just like in the individual datasets. In Figure 6 I summarize the first and second halves of the three datasets shown in Figs. 3-5, so you can see what I mean about the similarities in the timing of the peaks and valleys:
Figure 6. SFT Periodograms of the first and second halves of the three 208-year datasets. Top row is Berkeley Earth, middle row is CET, and bottom row is Armagh Observatory.
So this is even further confirmation both of the reality of the ~23-year cycle in the first half of the data … and of the reality of the total disappearance of the ~23-year cycle in the last half of the data. The similarity of these three datasets is a bit of a shock to me, as they range from an individual station to a global average.
So that’s the story of the SFT, the slow Fourier transform. The conclusion is not hard to draw. Don’t bother trying to capture temperature cycles in the wild, those jokers have been taking lessons from the Cheshire Cat. You can watch a strong cycle go up and down for a hundred years. Then just when you think you’ve caught it and corralled it and identified it, and you have it all caged and fenced about with numbers and causes and explanation, you turn your back for a few seconds, and when you turn round again, it has faded out completely, and some other cycle has taken its place.
Despite that, I do believe that this tool, the slow Fourier transform, should provide me with many hours of entertainment …
My best wishes to all,
w.
As Usual, Gotta Say It: Please, if you disagree with me (and yes, unbelievably, that has actually happened in the past), I ask you to have the courtesy to quote the exact words that you disagree with. It lets us all understand just what you think is wrong.
Statistical Significance: As I stated above, I used a 90% level of significance in coloring the significant data. This was for a simple reason. If I use a 95% significance threshold, almost none of the cycles are statistically significant. However, as the above graphs show, the agreement not only between the three independent datasets but between the individual halves of the datasets is strong evidence that we are dealing with real cycles … well, real disappearing cycles, but when they are present they are undoubtedly real. As a result, I reduced the significance threshold to 90% to indicate at least a relative level of statistical significance. Since I maintained that same threshold throughout, it allows us to make distinctions of relative significance based on a uniform metric.
Alternatively, you could argue for the higher 95% significance threshold, and say that this shows that there are almost no significant cycles in the temperature data … I’m easy with either one.
Data and Code: All the data and code used to do the analysis and make these graphics is in a 1.5 Mb zipped folder called “Slow Fourier Transform“. If you change your R directory to that folder it should all work. The file “sea level cycles.R” is the main file. It contains piles of code for this and the last two posts on tidal cycles. The section on temperature (this post) starts at about line 450. Some code on this planet is user-friendly. This code is user-aggressive. Things are not necessarily in order. It’s not designed to be run top to bottom. Persevere, I’ll answer questions.
The nomenclature that I have always understood as the standard:
DFT = discrete Fourier transform – relationship between (x0, …, xN-1) complex values (time domain) and (X0, …, XN-1) complex values (frequency domain).
FFT = algorithm for efficiently computing the DFT, i.e. using on the order of N log N operations rather than N^2 operations.
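Spelled out, the usual textbook form behind that nomenclature is

$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N}, \qquad k = 0, \dots, N-1,$$

with the FFT computing exactly these $X_k$ using on the order of $N \log N$ operations rather than $N^2$.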
I can’t wait for the flood of papers, in the newly resuscitated Pattern Recognition in Physics, using your method of SFT.
The Fourier Transform goes back to 1822. It takes the form:
F(ω) = ∫ f(t) cos(ωt+b) dt
where b is a phase. You can drop b; it’s enough to calculate a version with sin and cos to cover all phases (just use the sum formula for cos).
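Written out, that sum formula is

$$\cos(\omega t + b) = \cos b \,\cos(\omega t) - \sin b \,\sin(\omega t),$$

so fitting one zero-phase cosine and one zero-phase sine at each frequency covers every possible phase $b$.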
It’s an infinite integral, and works as a function. Supply any ω, and you can get F(ω). Nothing about a regular array. Numerically, though, you have to approximate. You can only have a finite interval, and evaluate f at a finite number of points. They don’t have to be equally spaced, but you need a good quadrature formula.
That’s basically what you are calling the SFT. In the continuum version, all trig functions are orthogonal, which means you can calculate parts of the spectrum independently. On a finite interval, generally only harmonics of the fundamental, with period equal to the length, are orthogonal to each other. And even then, only approximately, because of the inexact quadrature.
If you restrict to those harmonics, you have a Fourier series. If you use a quadrature with equally spaced intervals, the harmonics become orthogonal again (re summation), up to a number equal to the number of points. That is the Discrete Fourier Transform.
The Fast Fourier Transform is just an efficient way of doing the DFT arithmetic. It isn’t calculating something else. If you want to get a different set of frequencies from the DFT, you can use a longer interval, with padding. You’ll start from a lower fundamental, with the same HF limit, so the divisions are finer. With zero padding, you’re evaluating the same integrals, but the harmonics are orthogonal on the new interval, not the old. You’ll get artefacts from the cut-off to padding.
People use the FFT with padding because it quickly gives you a lot of finely spaced points, even if they’re not the ones you would ideally have chosen. It’s evaluating the same integrals as you are. You can interpolate the data for missing values – there is nothing better. You would do better to interpolate with the SFT too.
“the SFT method allows the calculation of the actual statistical significance of the results for each individual cycle length.”
Any of these methods gives an output which is just a weighted sum of the data. The significance test is the same for all.
BTW, I wrote the R code that you linked to.
Intriguing. The way the cycles of a given frequency fade up and down over time reminds me of nothing so much as when you listen carefully to a bell as the sound dies away. The individual frequencies (pitches) there also come and go as the energy surges from one mode of oscillation to another and back again. Given that the frequency components of the sound of a bell aren’t a neat series of exact number harmonics either, the analogy seems pretty good – although since the Earth’s climate has an external energy source to keep it “ringing”, perhaps a closer match would be a cross between a bell and a Chladni plate. A comparison with the Stadium Wave diagram might prove interesting, too.
Thank you Willis.
I think your results support the idea of climate as a chaotic system, and that the behavior does not disappear when looking at global mean temperatures.
My favorite is the chaos pendulum that is 100% deterministic but still has a chaotic motion that is unpredictable even if we know the energy content.
So even if we exactly knew the energy balance of the earth we could not predict the temperature in the atmosphere.
I agree with Nick Stokes on this one. An FT is an FT and the FFT is just a fast algorithm for achieving the result.
Otherwise, still a very interesting article as usual.
Meta-cycles? Cycles of cycles, the way they come and go?
First, the “Fast” Fourier transform is just one of a number of methods to achieve a transform from time-space to frequency-space. The FFT works because the transform is a bit like a matrix – lots of well-defined and related transforms which can be dramatically simplified. However, in principle it is just a discrete Fourier transform – which is a Fourier transform with regular data on a “comb” pattern.
Willis, what you are dealing with is a discrete Fourier transform on an irregular pattern. I would suggest being very careful, because the properties of a regular transform, whilst not easy to understand, are at least predictable by the expert. In contrast, when you start having missing data and even irregularly spaced data, it gets increasingly complex to understand how much of what is seen is an artefact of the way the data is presented and how much reflects the actual original continuous signal before it was reduced to a discrete set of data points.
Every such transform adds its own signature to the final signal. (At least when viewed in the entirety of frequency space).
When you take random noise and restrict the frequency components by e.g. averaging, or having a limited set of data, you are very likely to get some form of “ringing”.
Ringing looks like some form of periodicity. And for information, the first paper to mention global warming was a paper trying to explain why the “camp century cycles” of temperature that had been spotted in the camp century ice cores had not predicted global cooling as the climate community expected.
The rise in CO2 was seen as a way to explain why the predictions, based on (I think) something like 128-year cycles, had not occurred. That of course led to a new vogue of predicting the end of the world from CO2 – and the camp century cycle predictions of global cooling were hidden, and many even deny such predictions were made.
The statistics of frequency-phase space
Your post raises a very interesting question, which is this: how does one apply statistics to signals viewed in frequency-phase space? For a single frequency, the method I was taught was to look at the background profile of the noise – work out the signal-to-noise ratio and then use some statistics.
I therefore assume the probability of a signal not being noise (or random variation) for a signal with more than one frequency, is the average of the summation of this probability for all frequencies and phases.
Can anyone tell me if this is correct and/or what this kind of statistics is called so that I might finally find something someone else has written on it and know if I’m barking up the wrong tree.
In the past I’ve done some work with the CET data. It is correct that it doesn’t show the 60-year periodicity. Analysing the annual composite doesn’t reveal anything particularly interesting, but analysing (from the 350-year record) the two months around the summer (June & July) and winter (December & January) solstices, the rest being just transition, it is found that they have distinct spectra.
Furthermore, it becomes obvious where and why the ’60 year’ (AMO) confusion arises, as shown
HERE
i.e. the AMO spectrum (which clearly shows 60-year periodicity) is the envelope of 2 or 3 components clearly shown in the summer CET. Proximity of the CET to the AMO area would favour commonality of trends in the two.
“I’m also overjoyed that my method gives identical results to its much speedier cousin, the Fast Fourier transform (FFT), because the Slow Fourier Transform (SFT) has a number of very significant advantages over the FFT. These advantages are particularly important in climate science.”
……………………………
Should one really point to your own poor math education that explicitly?
Your “SFT” is known as DFT.
FFT does exactly the same thing DFT, just faster. There is no difference in the result.
Your “gaps” in data cannot be treated appropriately by any Fourier transform because of the Gibbs effect (look up what it is).
The right thing you might apply in this case might be wavelets.
Anyway a good idea would be to learn the basic math BEFORE you “publish” something.
Even if it is just WUWT.
Just a nitpick but the CET record is not an average of the same three stations over the time period you are analyzing. It was compiled originally from multiple records:
“Manley (1953) published a time series of monthly mean temperatures representative of central England for 1698-1952, followed (Manley 1974) by an extended and revised series for 1659-1973. Up to 1814 his data are based mainly on overlapping sequences of observations from a variety of carefully chosen and documented locations. Up to 1722, available instrumental records fail to overlap and Manley needs to use non-instrumental series for Utrecht compiled by Labrijn (1945), in order to make the monthly central England temperature (CET) series complete. Between 1723 and the 1760s there are no gaps in the composite instrumental record, but the observations generally were taken in unheated rooms rather than with a truly outdoor exposure….”
Since 1879 it has been made up from a selection of stations that have changed over time. CET is also a combination of records from different UK climate areas, from wet and humid (Lancashire) to hotter and dry (Cambridgeshire). Up to 1974 the monthly CET values were based on Oxford and seven stations in the NW (Lancashire etc.) but then, until 2004, they switched to Rothamsted, Malvern plus 0.5 of Ringway and Squires Gate – also included in the earlier seven. Then in 2004 they switched again to Rothamsted, Malvern and Stonyhurst equally weighted – Oxford? In 2007 they switched Pershore for Malvern. CET in the late 20th century does not agree with the official Meteo records for that area.
http://www.metoffice.gov.uk/hadobs/hadcet/ParkerHorton_CET_IJOC_2005.pdf
You criticized such compilations in your 2011 paper short-long-splice so what makes CET acceptable in your current paper and what impact does the splicing have?
http://wattsupwiththat.com/2011/11/11/short-splice-long-splice/#more-51027
Also, those who consider cycles to be present in the temperature record consider them to be a combination of multiple cycles with frequencies ranging from the 11 year solar cycle to the 1500 year (1300 to 1800 year) Bond cycle – how does this affect your results? Just asking!
Alex:
Please read Nick Stokes’ comment higher up; he says the same thing, but much more nicely.
Also, as I can’t see that you show any errors in the post, why so unpleasant?
I think looking for normal variability is important, and having read Eschenbach’s contribution I look forward to yours.
A few random thoughts:
1. I think, don’t actually know, that the magic in the FFT is that it is very lightweight computationally and therefore can be done in real time. That’s useful in a lot of situations that have no relevance whatsoever to what you are doing. Your method should be fine for your purposes?
2. If I understand Mssr Fourier’s ideas correctly, any time series can be approximated as the sum of a bunch of sine waves with appropriate amplitudes, periods, and phases. Which suggests (to me at least) that any set of data can be analyzed and cycles found. The test is not whether cycles can be found. They WILL be. But whether they predict usefully outside the interval analyzed.
3. Because random noise will have a Fourier transform, and random noise could go anywhere in the next interval with each possible “future” having its own unique transform, that suggests to me that there are possibly multiple valid solutions for the Fourier transform of the current interval. I’ve always meant to look into that. If I’m right about that, then your solutions might be correct, but not unique. THAT might be worth keeping in mind. Of course I could be fantasizing.
Anyway, good luck. Keep us posted.
“FFT does exactly the same thing DFT, just faster. There is no difference in the result.”
That is true. The difference is in finding out that you calculate the same values more than once, and avoiding it.
Willis, if you fit a sine wave and a cosine wave (both of zero phase) then you do not have to search for phase. The best fit phase is a linear combination of the sine and cosine fits.
I think that a simple sum of the squares of the amplitudes will give you what you want. To get back to the same units, take the square root, i.e. Result = sqrt(C^2 + S^2), with C and S being the amplitudes of the best fits.
It should speed up your SFT if you don’t have to search for phase.
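In symbols, the identity behind that shortcut is

$$a \sin(\omega t) + b \cos(\omega t) = R \sin(\omega t + \varphi), \qquad R = \sqrt{a^2 + b^2}, \quad \varphi = \operatorname{atan2}(b, a),$$

so a single linear least-squares fit of the two zero-phase components recovers both the best-fit amplitude and the best-fit phase with no iterative search.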
@Scottish Sceptic.
If the noise is Gaussian, then the distribution of the power spectrum is a chi-squared distribution, and each harmonic has a chi-squared distribution with 2 degrees of freedom, i.e. a negative exponential. This allows the significance of spectral components to be tested. In non-Gaussian signals, the distribution of the power spectrum can often be calculated via the characteristic function of the distribution.
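For the record, under one common periodogram normalization and for white Gaussian noise of variance $\sigma^2$,

$$\frac{2\,I(f_k)}{\sigma^2} \sim \chi^2_2, \qquad \Pr\{I(f_k) > x\} = e^{-x/\sigma^2},$$

so a candidate spectral peak can be tested against that exponential background level.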
I’m not sure how you are “fitting” a sine/cosine. If you are simply calculating the integral of the product of a function with a specific sine/cosine wave, then you are simply calculating a DFT.
However, you refer to “optimisation”, which, strictly, implies fitting a sine wave according to an objective function, i.e. least squares or some other criterion. If you are doing this then you are definitely not calculating Fourier components, because the processes are not mathematically equivalent.
I agree with Alex about gaps in the data. A sampled signal is created by multiplying a continuous signal by a set of Dirac functions. In the frequency domain this is a convolution between the spectrum of the sampling process and the signal spectrum. If the sampling is irregular, you will NOT calculate the true spectrum of the signal.
See Oppenheim and Schafer: Digital signal processing, chapter 1.
Willis: “Again, this supports my contention that looking for regular cycles in climate data is a fools errand. ”
Indeed, anyone thinking that they will find stable, constant amplitude signals resulting from the multiple interacting systems that make up climate is a fool.
FFT, SFT, PA, whichever way you do it, you will only find CONSTANT AMPLITUDE signals this way, because that’s all you are trying to fit.
That does not mean it’s useless but it needs some more skill to interpret what is shown.
Once again we need the trig identity:
cos(A)+cos(B) = 2*cos((B+A)/2)*cos((B-A)/2) eqn 1
(where cos(A) means cos (2*pi*t/A) if we want to talk in periods not frequency.)
With a little basic algebra this can be expressed the other way around:
cos(C)*cos(D) = 0.5*( cos(C+D) + cos(C-D) ) eqn 2
That is also a special case where the amplitudes of each component are equal. It is more likely that the modulation will be moderate rather than totally going down to zero at some point (which is the RHS as it is shown above). The result is a triplet of peaks that are symmetrically placed in terms of frequency as I explain here:
http://climategrog.wordpress.com/2013/09/08/amplitude-modulation-triplets/
If the physical reality is the RHS, what you will find by frequency analysis is the LHS. It’s not wrong, it is mathematically identical. What can go wrong is in the conclusions that are drawn if this equivalence is not understood, and if the fact that frequency analysis cannot directly show amplitude-varying signals is not realised.
Now if we look at Figure 3, where the Armagh dataset is split into first and second halves, in the top panel (full) we see a small peak at what looks like 28 years, which is missing in the middle panel (first half).
In “full” there are two pairs of peaks either side of 28 that appear (as well as can be guessed from these graphs) to be the modulation triplets I’ve described, both centred on 28y.
Also, the side peaks in red in the middle panel look like they overlay the left and right pairs in “full”. This may be due to lesser resolution from using an earlier period, or just sampling a different part of the modulation series.
The red peaks appear to be 23 and 36; by eqn 1 that gives a modulation period of 120y that will produce a “beat” envelope of 60 years. None of which will show directly in the freq analysis unless there is strong non-linearity or some secondary effects in climate causing an additional signal.
Now looking at the triplets in “full” and calculating the average _freq_ of the outer peaks:
21 and 41 => centre 27.7y, modulation 86y
24 and 34 => centre 28y, modulation 163y
Symmetrical as I suggested and centred on the same value: that of the centre peak. Again using eqn 1 we find the modulation periods causing these triplets to be 86y and 163y. A higher resolution spectrum would give a better indication of the symmetry (which is an important safeguard against seeing spurious triplets everywhere).
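(For anyone checking the arithmetic: for symmetric side peaks with periods $P_1 < P_2$, averaging and differencing the corresponding frequencies gives

$$P_{\text{carrier}} = \frac{2}{1/P_1 + 1/P_2}, \qquad P_{\text{mod}} = \frac{2}{1/P_1 - 1/P_2};$$

with $P_1 = 21$ and $P_2 = 41$ that is a carrier of about 27.7 years modulated over about 86 years, matching the figures above.)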
In the lower panel (2nd half) we see again 21 and 28, and the suggestion of something that could be the third peak seen in “full”, which is not resolved from a strong rise at the end.
So instead of throwing up your hands and saying “go figure” like it is some kind of result, maybe you need to “go figure” what is in the spectrum and how to interpret it.
I will try to take a closer, higher-resolution look at the Armagh record later; it looks interesting.
And the Berkeley Earth and HadCRUT4 data are global averages.
According to your Fig.1, Berkeley Earth has global average temperatures starting from 1752… Sorry, but suspension of disbelief goes only so far.
I’m not sure about this process. I have ‘played’ with ‘sound’ and the Fourier transform. Try 3 minutes of music (continuous) and then try the same 3 minutes with 10 seconds chopped out of every minute. Compare your results, then get back to us. I may be wrong.
Interesting read. I too like to start from scratch (first principles) when deriving methods and “data fits”. There are advantages to that path. Like you, I’m “tickled” when my approach turns out to be one already in use by folks “who should know”. At rare times, I’ve proved the “accepted method” was inapplicable to a specific problem (usually due to incorrect assumptions “hidden” in the “accepted method”). In those rare cases, my “thinking” was seriously challenged by “entrenched professionals”, at first. FWIW to someone who seems to derive similar pleasure from a similar activity.
Ah, I’ve just noticed that both CET and Armagh have a peak at about 85y (hard to be more precise on this scale). This obviously ties in with the modulation triplets I noted and suggests some physically separate variation or strong non-linear effect is present.
Anyway, it seems to corroborate the (21, 28, 41) triplet as being correctly interpreted as 27.7y modulated by 86y.
RS: “If the sampling is irregular, you will NOT calculate the true spectrum of the signal.”
That is strictly true. However, as long as this is recognised, I think it’s a useful means of bridging gaps in the data. Some awareness of the distortion caused and not doing it over very broken data would seem prudent but I think it’s a useful alternative to zero padded FFT and similar techniques that have strict continuity requirements.
A SFT of the NAO would be interesting.
Willis. More nice work and you get people thinking. A few comments/thoughts:
1. Nick Stokes’ comment is well worth reading.
2. Radix-2 FFTs are limited to lengths of 2^n (16, 32, 64, 128 …), though more general FFT algorithms exist for other lengths.
3. People often pad the time series with zeros to get finer detail in the frequency domain (see the R sketch after this list).
4. While regular sampling is implied in the FFT, you could just jam irregular data through it, as it only takes the time series as input and not the sample times. The result would not be reliable.
5. FFTs are not approximations. They are an exact band-limited representation of a time series. This is why you can transform to frequency and back to time and recover the original data set.
6. FFT frequency profiles are normally shown as a power spectrum from low to high frequency (the reverse of your plots).
7. Gaps in a time series are easily interpolated with little effect on the frequency content. Large gaps can be worked around by splitting the time series at the gap and applying an FFT separately to both sections.
8. Units are “real” and “preserved” in an FFT. Time becomes frequency, for example (secs > Hz or 1/secs). Upon the reverse FFT you always go back to your original units.
9. The early Berkeley dataset shows well-behaved higher-frequency content (Figure 5) that is not present in the other datasets. This makes me wonder if it is more heavily filtered, or outliers removed.
10. If I was researching the frequency content of a time series and obtained power spectra similar to your periodicity diagrams I would cry. Based on the large variation in frequency content one can only conclude that many of the climate “cycles” we read about are not periodic in the temperature data, or fall well below natural variation. This is probably what makes climate scientists so “valuable”.
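Here is a small R sketch of the zero-padding mentioned in point 3 (illustration only, not from the post’s code). It uses R’s built-in monthly `co2` series purely so that it runs stand-alone; any regularly sampled series could be substituted. The padding gives finely spaced frequency points, but (as Nick Stokes notes above) it adds no information beyond the original record.

```r
# Illustration only (not from the post): zero-padded FFT for a finer frequency grid.
x    <- as.numeric(co2) - mean(co2)          # built-in monthly CO2 series, demeaned
n    <- length(x)
pad  <- 4 * n                                # pad to four times the original length
X    <- fft(c(x, rep(0, pad - n)))
freq <- (0:(pad - 1)) / pad                  # frequency in cycles per month
amp  <- 2 * Mod(X) / n                       # approximate peak amplitude in data units
keep <- freq > 0 & freq <= 0.5               # positive frequencies up to Nyquist
plot(1 / (12 * freq[keep]), amp[keep], type = "l", log = "x",
     xlab = "Period (years)", ylab = "Amplitude")
```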