This DSP engineer is often tasked with extracting spurious signals from noisy data. He submits this interesting result of applying these techniques to the HadCRUT temperature anomaly data. Digital Signal Processing analysis suggests cooling ahead in the immediate future, with no significant probability of a positive anomaly exceeding 0.5 °C between 2023 and 2113. See Figures 13 and 14. Code and data are made available for replication. – Anthony
Guest essay by Jeffery S. Patterson, DSP Design Architect, Agilent Technologies
Harmonic Decomposition of the Modern Temperature Anomaly Record
Abstract: The observed temperature anomaly since 1900 can be well modeled with a simple harmonic decomposition of the temperature record based on a fundamental period of 170.7 years. The goodness-of-fit of the resulting model significantly exceeds the expected fit to a stochastic AR sequence matching the general characteristics of the modern temperature record.
Data
I’ve used the monthly Hadcrut3 temperature anomaly data available from http://woodfortrees.org/data/hadcrut3vgl/every as plotted in Figure 1.
Figure 1 – Hadcrut3 Temperature Record 1850-Present
To remove seasonal variations while avoiding spectral smearing and aliasing effects, the data was box-car averaged over a 12-month period and decimated by 12 to obtain the average annual temperature plotted in Figure 2.
Figure 2 – Monthly data decimated to yearly average
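For concreteness, here is a minimal sketch of that averaging-and-decimation step, assuming the monthly anomalies have been read into a flat list named monthly (a name introduced here for illustration):

(* 12-month box-car average followed by decimation by 12: average
   non-overlapping 12-sample blocks to obtain one value per year *)
yearly = Mean /@ Partition[monthly, 12];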
A Power Spectral Density (PSD) plot of the decimated data reveals harmonically related spectral peaks.
Figure 3 – PSD of annual temperature anomaly in dB
To eliminate the possibility that these are FFT (Fast Fourier Transform) artifacts while avoiding the spectral leakage associated with data windowing, we use a technique called record periodization. The data is regressed about a line connecting the record endpoints, dropping the last point in the resulting residual. This process eliminates the endpoint discontinuity while preserving the position of the spectral peaks (although it does attenuate the amplitudes at higher frequencies and modifies the phase of the spectral components). The PSD of the residual is plotted in Figure 4.
Figure 4 – PSD of the periodized record
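A minimal sketch of the periodization step just described, assuming yearly holds the decimated annual series from above:

(* Regress out the line joining the first and last samples, then drop the
   last point so the residual wraps with no endpoint discontinuity *)
n = Length[yearly];
endLine = Table[yearly[[1]] + (yearly[[n]] - yearly[[1]])*(k - 1)/(n - 1), {k, n}];
periodized = Most[yearly - endLine];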
Since the spectral peaking is still present, we conclude these are not record-length artifacts. The peaks are harmonically related, with odd harmonics dominating up to the eighth. Since spectral resolution increases with frequency, we use the eighth harmonic of the periodized PSD to estimate the fundamental. The following Mathematica (Mma) code finds the 5th peak (the 8th harmonic) and estimates the fundamental:
(* Locate the highest spectral peak above w = 0.25 (the 8th harmonic) and divide by 8 *)
wpkY1 = Abs[ArgMax[{psdY, w > .25}, w]]/8
0.036811
The units are radian frequency across the Nyquist band, mapped to ±π (the plots are zoomed to 0 < w < 1 to show the area of interest). To convert to years, invert wpkY1 and multiply by 2π, which yields a fundamental period of 170.7 years.
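As a quick check of that conversion, using the value found above:

(* period in years = 2*Pi divided by the radian frequency per yearly sample *)
2*Pi/0.036811
(* ≈ 170.69, the ~170.7-year fundamental quoted in the text *)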
From inspection of the PSD we form the harmonic model (note all of the radian frequencies are harmonically related to the fundamental):
(* Define the 5th-order harmonic model used in the curve fit *)
model = AY1*Sin[wpkY1*t + phiY1] + AY2*Sin[2*wpkY1*t + phiY2] +
   AY3*Sin[3*wpkY1*t + phiY3] + AY4*Sin[4*wpkY1*t + phiY4] +
   AY5*Sin[5*wpkY1*t + phiY5];
vars = {AY1, phiY1, AY2, phiY2, AY3, phiY3, AY4, phiY4, AY5, phiY5}
and fit the model to the original (unperiodized) data to find the unknown amplitudes, AYx, and phases, phiYx.
fitParms1 = FindFit[yearly, model, vars, t]
fit1 = Table[model /. fitParms1, {t, 0, 112}];
residualY1 = yearly - fit1;

{AY1→-0.328464, phiY1→1.44861, AY2→-0.194251, phiY2→3.03246, AY3→0.132514, phiY3→2.26587, AY4→0.0624932, phiY4→-3.42662, AY5→-0.0116186, phiY5→-1.36245, AY8→0.0563983, phiY8→1.97142, wpkY1→0.036811}
The fit is shown in Figure 5 and the residual error in Figure 6.
Figure 5 – Harmonic model fit to annual data
Figure 6 – Residual Error
Figure 7 – PSD of the residual error
The residual is nearly white, as evidenced by Figure 7, justifying use of the Hodrick-Prescott filter on the decimated data. This filter is designed to separate cyclical, non-stationary components from data. Figure 8 shows an excellent fit with a smoothing factor of 15.
Figure 8 – Model vs. HP Filtered data (smoothing factor=3)
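Mathematica has no built-in Hodrick-Prescott filter, so as a rough illustration of what the filter does (hpFilter is a helper written for this sketch, not necessarily the code behind Figure 8), the standard penalized least-squares formulation can be solved directly:

(* HP filter: minimize |x - tau|^2 + lambda*|D2.tau|^2, where D2 takes second
   differences; the solution is tau = Inverse[I + lambda*D2'.D2].x *)
hpFilter[x_List, lambda_] := Module[{n = Length[x], d2},
  d2 = SparseArray[{i_, j_} :> Switch[j - i, 0, 1, 1, -2, 2, 1, _, 0], {n - 2, n}];
  LinearSolve[IdentityMatrix[n] + lambda*Transpose[d2].d2, x]]

smoothed = hpFilter[yearly, 15];  (* the smoothing factor quoted in the text *)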
Stochastic Analysis
The objection that this is simple curve fitting can rightly be raised. After all, harmonic decomposition is a highly constrained form of Fourier analysis, which is itself a curve-fitting exercise that yields the harmonic coefficients (where the fundamental is the sample rate) which recreate the sequence exactly in the sample domain. That does not mean, however, that any periodicities found by Fourier analysis (or, by implication, harmonic decomposition) are not present in the record. Nor, as will be shown below, is it true that harmonic decomposition of an arbitrary sequence would be expected to yield the goodness-of-fit achieved here.
The 113-sample record examined above is not long enough to attribute statistical significance to the fundamental 170.7-year period, although others have found significance in the 57-year (here 56.9-year) third harmonic. We can, however, estimate the probability that the results are a statistical fluke.
To do so, we use the data record to estimate an AR process.
(* Estimate a 5th-order autoregressive process from the annual data *)
procY = ARProcess[{a1, a2, a3, a4, a5}, v];
procParamsY = FindProcessParameters[yearlyTD["States"], procY]
estProcY = procY /. procParamsY
WeakStationarity[estProcY]  (* verify the estimated process is stationary *)
{a1→0.713, a2→0.0647, a3→0.0629, a4→0.181, a5→0.0845, v→0.0124391}
As can be seen in Figure 9 below, the process estimate yields a reasonable match to the observed power spectral density and covariance function.
Figure 9 – PSD of estimated AR process (red) vs. data
Figure 9b – Correlation function (model in blue)
Figure 10 – 500 trial spaghetti plot
Figure 10b – Three paths chosen at random
As shown in Figure 10b, the AR process produces sequences which match the general character of the temperature record. Next we perform a fifth-order harmonic decomposition on all 500 paths, taking the variance of the residual as a goodness-of-fit metric. Of the 500 trials, harmonic decomposition failed to converge 74 times, meaning that no periodicity could be found which reduced the variance of the residual (this alone disproves the hypothesis that any arbitrary AR sequence can be decomposed). To these failed trials we assigned the variance of the original sequence. The scattergram of results is plotted in Figure 11 along with a dashed line representing the variance of the model residual found above. (A sketch of this trial loop appears below.)
Figure 11 – Variance of residual; fifth order HC (Harmonic Coefficients), residual 5HC on climate record shown in red
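A hypothetical sketch of that trial loop, with a free fundamental frequency w0 so the fit can search for a periodicity in each path (the symbol names, seed value and conventions here are illustrative, not the code behind Figure 11; estProcY is the AR estimate from above):

(* Draw 500 paths from the estimated AR process and fit a 5th-order harmonic
   model to each, recording the variance of the fit residual *)
hm = A1*Sin[w0*t + P1] + A2*Sin[2*w0*t + P2] + A3*Sin[3*w0*t + P3] +
   A4*Sin[4*w0*t + P4] + A5*Sin[5*w0*t + P5];
hv = {A1, P1, A2, P2, A3, P3, A4, P4, A5, P5, {w0, 0.037}};  (* seed the fundamental *)
paths = RandomFunction[estProcY, {0, 112}, 500]["ValueList"];
trialVar[path_] := Module[{fp = Quiet[FindFit[path, hm, hv, t]]},
  (* FindFit on a plain list takes the abscissa as t = 1 .. n *)
  Variance[path - Table[hm /. fp, {t, 1, Length[path]}]]];
trialVars = trialVar /@ paths;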
We see that the fifth-order fit to the actual climate record produces an unusually good result. Of the 500 trials, 99.4% resulted in residual variance exceeding that achieved on the actual temperature data. Only 1.8% of the trials came within 10% and 5.2% within 20%. We can estimate the probability of achieving this result by chance by examining the cumulative distribution of the results plotted in Figure 12.
Figure 12 – CDF (Cumulative Distribution Function) of trial variances
The CDF estimates the probability of achieving these results by chance at ~8.1%.
Forecast
Even if we accept the premise of statistical significance, without knowledge of the underlying mechanism producing the periodicity, forecasting becomes a suspect endeavor. If, for example, the harmonics are being generated by a stable non-linear climatic response to some celestial cycle, we would expect the model to have skill in forecasting future climate trends. On the other hand, if the periodicities are internally generated by the climate itself (e.g. feedback involving transport delays), we would expect both the fundamental frequency and, importantly, the phase of the harmonics to evolve with time, making accurate forecasts impossible.
Nevertheless, having come thus far, who could resist a peek into the future?
We assume the periodicity is externally forced and that the climate response remains constant. We are interested in modeling the remaining variance, so we fit a stochastic model to the residual. Empirically, we found that, again, a 5th-order AR (autoregressive) process matches the residual well.

(* Fit a 5th-order AR process to the residual of the harmonic model *)
tDataY = TemporalData[residualY1 - Mean[residualY1], Automatic];
yearTD = TemporalData[residualY1, {DateRange[{1900}, {2012}, "Year"]}]
procY = ARProcess[{a1, a2, a3, a4, a5}, v];
procParamsY = FindProcessParameters[yearTD["States"], procY]
estProcY = procY /. procParamsY
WeakStationarity[estProcY]  (* confirm the fitted process is stationary *)
A 100-path, 100-year run combining the paths of the AR model with the harmonic model derived above is shown in Figure 13.
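A minimal sketch of how such a projection can be assembled, assuming model, fitParms1 and the residual AR estimate estProcY from above (illustrative only, not necessarily the code behind Figure 13):

(* Extend the deterministic harmonic model 100 years past the record and add
   100 random AR paths to represent the remaining (stochastic) variance *)
horizon = 100;
det = Table[model /. fitParms1, {t, 113, 112 + horizon}];
simPaths = RandomFunction[estProcY, {0, horizon - 1}, 100]["ValueList"];
projections = (det + #) & /@ simPaths;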
Figure 13 – Projected global mean temperature anomaly (centered 1950-1965 mean)
Figure 14 – Survivability at 10 (Purple), 25 (Orange), 50 (Red), 75 (Blue) and 100 (Green) years
The survivability plots predict no significant probability of a positive anomaly exceeding 0.5 °C between 2023 and 2113.
Discussion
With a roughly one-in-twelve chance that the model obtained above is the manifestation of a statistical fluke, these results are not definitive. They do however show that a reasonable hypothesis for the observed record can be established independent of any significant contribution from greenhouse gases or other anthropogenic effects.
Stephen Rasey says:
September 14, 2013 at 5:12 pm
I take that sentence to read that the CO2 doubling will take hundreds of years to occur. Is that in line with their current projections?
I agree with “again the pause shows that CO2 sensitivity is tiny to insignificant with other significant factors unrecognized.” I recently posted a follow-up to the SSA analysis I posted earlier (http://montpeliermonologs.wordpress.com/2013/09/14/ssa-of-the-hardcrut3-record-part-2/) which examines the first two SSA eigenmodes. It shows the trend has been losing steam since 1965, while CO2 rose exponentially. The residual from just the first two modes is trendless and noise-like. Guess the AGW is all in the ocean, just below the Argo buoys’ maximum depth. Rats, and after we spent all that money deploying them. AGW is weally a Wascally wabbit.
Jeff Patterson:
Active man-made systems often incorporate elements, such as op amps, transducers, various modulators and true feedback loops, that can produce all sorts of harmonics when driven by a sinusoidal input. Those elements, however, are simply not found in the inanimate natural world. In passive natural thermodynamic systems, whatever harmonics one finds are usually the product of direct forcing. That is certainly the case with the diurnal temperature cycle that is driven by the cosine-arch of insolation, with zero values at night.
Although it’s not clear what sort of “transport delay” you had in mind, the workings of entropy–which always disperse thermal energy rather than concentrate it–militate against anything similar to the mechanical resonance found in some second-order systems.
If you ignore physics completely and concentrate solely on DSP algorithms applied to woefully short records, you can find all sorts of “cycles” that have no embodiment in reality. Without any frequency decimation, the raw periodogram that you present in your post is not even a consistent estimator of the PSD. And you’ll find that the results of SSA depend quite critically upon some rather arbitrarily chosen analysis parameters. The case for strictly periodic components in the “climate” signal remains flimsy at best.
@1sky1
Thank you for your thoughtful reply. The transports I had in mind were things like the delay in ocean heat transport to the poles, the delay in albedo changes from rising temps, delays in the carbon cycle, etc. It seems to me that, to the extent the output of any delayed mechanism contributes to the forcing, it could result in frequency-selective peaking in the climate response, even if passive.
You said, “In passive natural thermodynamic systems, whatever harmonics one finds are usually the product of direct forcing.” Does this lend credence to the notion that the harmonic relationship seen above in the SSA analysis is due to external forcing, or are you of the opinion that it is a statistical fluke?
I’d be very interested in your thoughts about the analysis outlined here. http://montpeliermonologs.wordpress.com/2013/09/16/global-warming-a-la-mode/
It makes no assumption about underlying physical mechanisms nor whether they are persistent or transitory. It does use SSA to decompose the climate record but its conclusion (I maintain) is independent of the means of decomposition.
All the best
@1sky1 9/16 3:12 pm
Let me start off by saying that I do not believe in a linear system response of the earth that is over a decade long. I recognize the possibility, but I do not believe it is real. Hiding decades of climate forcings in 0.04 deg C warming of the deep oceans is theater, not thermodynamics.
I do believe that the earth has a profound NON-linear systems response that can run decades, or centuries in duration. It explains causes of ice ages, of warm periods, and of hot houses. It is chaos as seen in the geologic record. This hypothesis has the ramification that the sign of a change in “forcing” is not mathematically symmetrical as we would expect in a linear system.
It is quite possible that a +1 W/m^2 (If I choose to get trapped in a questionable paradigm) has a short system response, where as a -1 W/m^2 has a much longer response.
As you point out, 1sky1, it means little to talk of response functions until you can point to physical processes that can animate these responses. Two of the processes are snow and Life. The positive feedback of increasing snow begetting colder temperatures begetting more snow leading to ice ages is accepted. Likewise, the retreat of ice sheets causes a positive feedback of warming that is also well accepted. Less accepted are the mechanisms for the turn — what flips to start the warming or cooling if it isn’t astronomical (orbits, inclination, impacts, or solar super cycle) or volcanic?
Less accepted, I think, is the non-linearity of Life. Life, it seems, is a powerful negative feedback mechanism on this planet. It is the core of the Gaia Hypothesis. When one thinks of a typical feedback loop, we often model it as if it were a linear system in a known range: if a +x change in input yields a +y output, then a -x change in input yields a -y change in output.
The non-linearity comes into play when we realize that it is much easier to destroy than to create. A bountiful spring and summer can result in much seed for the next year. This would be a quick-cycle feedback. Conversely, a late spring, dry fiery summer, or early autumn can harm the life cycle, causing many years of recovery – a lengthy system response. Timberlines should fall faster than climb. Tundra lines should move equator-ward faster than pole-ward. Forests are felled or burned faster than they can regrow. Reefs are bleached faster than they repopulate. Plagues wipe out populations, taking generations to grow back, if ever.
The mathematics to model such a world are difficult – provided you don’t worry about accuracy. It makes for interesting science fiction, but intractable science. Fourier Analysis is pointless here. I argue that it is impossible to realistically model such a world with any sort of consensus. There are too many non-linear, non-symmetric system responses to nail down and agree upon; too many degrees of freedom. In such a system, the very concept of an Equilibrium Climate sensitivity to [CO2] has little meaning; a linear system concept made irrelevant by non-linear system dynamics.
Jeff Patterson:
Regrettably, demands on my time force my responses to be very brief.
The delays that you mention essentially constitute either a straightforward spatial redistribution of thermal energy around the globe or, in the case of albedo changes, an adaptive change in the forcing and/or response characteristics of the global system. Certainly clouds are the gatekeepers of insolation, capable of modulating the nearly constant TSI seen at TOA. Unfortunately, we know precious little about their long-term variability on a global scale. Nevertheless, being a product of chaotic convective development, it’s highly unlikely that such modulation would be a strictly periodic function of any kind.
I actually suspect that the “harmonic periodicities” found in Hadcrut3 are an artifact of the not-very-transparent construction of that index, which relies greatly upon manufacturing time-series from highly scattered observations by ships of opportunity. It differs very materially in that regard from any genuine long-term temperature time-series that I’ve ever analyzed over decades of work with such series. Lacking any physical basis for a strictly periodic driver, one should look for stochastic forcing of appropriate bandwidth instead. After all, it is signal bandwidth that determines its predictability.
Stephen Rasey:
Pressed for time, I’m going to mull over your philosophical remarks before replying tomorrow.
Stephen Rasey:
Having worked extensively on dynamical-system problems in a geophysical setting throughout the mature portion of my career, I don’t rely upon Fourier analysis blindly. The principal nonlinearities in the climate system, beyond the fourth-root dependence of temperature upon energy, are found in the Navier-Stokes equations governing convective and advective transport. Important as it is, that transport doesn’t directly affect the thermalization of insolation on a global yearly-average basis. Thus much is to be learned about year-to-year variations from a linear-system standpoint. It turns out that, at stations where long, vetted records are available, nothing coheres as consistently with temperature throughout the entire spectral baseband as total sunshine hours experienced. Bi-spectrum analysis seldom shows any indication of strong quadratic interaction within such records. The imprint of solar forcing is clearly evident even in locations, such as Ross-on-Wye, where advective effects dominate during overcast winters.
The nonlinearities that you point to are, of course, real. But for many practical purposes, they can be effectively linearized, quite irrespective of their time scale. What distrust I have of multidecadal and longer components manifest in many manufactured indices stems from the fact that they have been heavily corrupted by one means or another. Nor do the “feedbacks” that you mention constitute a true looping back of system output into system input; they are more akin to changes in response characteristics, which are ordinarily assumed to be time-invariant. This indeed makes their mathematical modeling much less tractable, but by no means impossible. And in such modeling Fourier analysis (though, of course, not the simplistic periodogram) continues to play a vital role.
Finally, let’s not forget that solar energy ultimately powers not only thermal processes, but mechanical ones (winds) and biological ones (plant growth) as well. As seen in the Milankovitch-cycle explanation of the ice ages, it is variations in solar forcing, rather than nonlinear internal responses, that dominate the record.
Jeff Patterson:
A brief addendum, based upon reviewing the results of SSA that you linked to:
It would be instructive to break Hadcrut3 into three disjoint 54-year segments and do identical SSAs on each. Such an exercise might dissuade you from the view that such analysis (which is essentially an empirical orthogonal function decomposition based on the Karhunen-Loeve expansion) reveals in its lowest modes some quantitatively consistent, physically meaningful “components” of the time-series, “independent of the means of decomposition.”
@1sky1
It turns out that, at stations where long, vetted records are available, nothing coheres as consistently with temperature throughout the entire spectral baseband as total sunshine hours experienced.
I’ll buy that. But it begs the question of what drives total sunshine hours?
(assuming that is the driver of temperature and not the other way around)
That’s more of a rhetorical question, getting back to my non-linear life-based drivers.
The nonlinearities that you point to are, of course, real. But for many practical purposes, they can be effectively linearized, quite irrespective of their time scale.
Maybe. Yes, non-linear functions are commonly linearized within narrow domains to make the math easier. I was drawing attention to that common assumption, which here may not be valid. How do you linearize an effect whose properties resemble a hysteresis? How do you linearize a process whose system response is path dependent?
Let me give you an economic example. Back in 1982 at the SEG conference, I gave a panel paper that for argument’s sake began with an oscillating Oil Price curve from $35/bbl to $15/bbl as kind of a joke. I next put up the traditional Supply-Demand curve. “We are explorationists. We find tomorrow’s oil. We know today where there are a trillion bbls of oil. — It’s just a question of price to get it out. If we feel the price will be high enough, we build our Alaskan Pipeline, we develop our North Sea, we build our deep water platforms, we develop Athabasca. Working our way up the Supply price curve to greater supply. But then, demand adjusts to the higher prices, lowering the clearing price.
“So do we turn off the TAPS? Do we turn off North Sea? Do we shut in our platforms? Do we work our way down the SAME supply curve? WE DO NOT. We accept a lower price at the same supply rate. Instead of sliding back down that one long-term supply curve, we depart the curve, dropping to a steeper curve, with lower prices for the same previous supply points.
“The Long Term Oil Supply curve is a hysteresis! Changes in prices are met by delayed restoring forces to supply. In any system that involves delayed restoring forces you get (first price projection slide) an oscillator! ” The slide that started out as a joke returns as a dead serious possibility. And it turned out I was being optimistic — the bottom was below $12/bbl about 5 years later.
So, linearizing a non-linear curve is not the key issue. It is recognizing that some systems’ performance is a path-dependent, non-symmetric hysteresis. That is exactly the way much of Life works. Maybe… so does the Climate — at least that part of the Climate influenced by Life.
Well, Mr. Rasey, perhaps, you intuited something significant in your “hysteresis” supposition. If Lady Moon’s 18.6-year odysseys, gently but firmly shaping the ocean tides, are a key driver of ocean heat oscillations, her “hysterical” [from Greek] influence may, as others above posited, be a key mechanism.
Silly, huh? Well, I, the night janitor, had fun grabbing the chalk and writing that up on the board, here, in the dark university classroom, after everyone’s gone home.
And………… IT’S A FULL MOON. #(:)) Heh, heh, heh.
Even though you likely won’t read this, THANK YOU, Jeff Patterson, for all that first class (albeit obscure, sometimes, for a non-scientist) straightforward, pure science, honest analysis. I LOVE THIS SITE — one learns so much. And, thanks, too, Mr. Patterson, for persevering in answering all the (sometimes worthwhile, sometimes obtuse or off-base, and, SOMETIMES, just plain obnoxiously rude) challenges to your methods, above. I hope you are given the right to post graphics — YOU certainly should!
Stuffed with undefined argot. Who are you writing for?
Jeff, I’m still waiting for your explanation of my falsification of your claim that
I said that wasn’t true, and provided the following graphic showing two distinct sine waves going through the two points (0,0) and (1.3, 0.9). Here’s the graphic again:
[Graphic: two distinct sine waves passing through the points (0,0) and (1.3, 0.9)]
Your first response was to claim I’d used three points, and I pointed out that no, only two points were specified, and they fit your criteria.
At that point, you disappeared from that topic and went on to discuss a host of other things … interesting things to be sure.
But when a man does that, makes a claim and then runs away when people object (I wasn’t the only one), I fear that their credibility goes down the tubes.
Now, you certainly appear to have fooled some of the people some of the time. For example, Janice gushes:
Sorry, Janice, he didn’t persevere in the slightest. He ran away from that question and never returned. In addition, he did only the most simplistic type of out-of-sample analysis, he claims he can determine a cycle of a most precise length (170.7 years) from 110 years of data, he did not do anything resembling a comprehensive Monte Carlo analysis. His response to my pointing out that the climate is chaotic was also most puzzling …
Overall, I’d give his work about a C+, maybe a B-, but more for effort than results. However, his running away from my falsification of a very important claim, a claim that is the basis of his idea that you can detect a 170.7 year cycle in 110 years of data, takes it down to an F.
Still waiting …
w.
Let me also say that I strongly support the analysis of 1sky1, who says above:
What he said …
w.
Janice Moore:
The moon’s 18.6-year nodal precession cycle is a very far-fetched driver for any long-period thermal oscillations. To begin with, the tides cannot change the total volume, the effective area, or the thermal capacity of water on our planet. Their amplitude is measured in centimeters in the deep ocean, and the precession cycle doesn’t even make the top 20 of tidal constituents. Furthermore, while hysteresis alters the wave form by introducing harmonics, it doesn’t produce sub-harmonics. What multi-decadal and longer oscillations are evident in SST data are much more likely the result of little-explored modulations of surface insolation by clouds, i.e. induced oscillations in the hydrological cycle.
Stephen Rasey:
Instantaneous (memory-less) nonlinearities are stochastically linearized by so-called “describing function” methods, which allow a spectral viewpoint to be maintained in analysis, albeit without the assurance of independence of different frequencies. In other words, one can deal effectively with phase-locked components, which can be recognized via bi-spectrum analysis. When “memory” is involved–which capacitive systems always manifest–it’s an entirely different ball game. And with such systems subjected to stochastic excitation, you get closer to how the climate system appears to operate. Even if I had the free time (and the typing skills) to produce a mini-essay on that subject, by contractual terms I would not be permitted to do so here. Thus I must leave this discussion on a note that satisfies neither one of us. Your civilized discourse nevertheless is much appreciated.
Thank you, 1 Sky 1, for dignifying my feeble attempt at a post with such a thoughtful response.
Willis Eschenbach says: Still waiting …
September 20, 2013 at 5:04 am
Sorry for the delayed response – my day job’s been pressing in of late and I’ve been working on SSA.
Your plot demonstrates the lack of spectral resolution in a two-point record. The two points can be reconstructed without distortion from their discrete Fourier transform (even though Nyquist is violated), but the two signals cannot be spectrally separated because there are only two spectral bins in the discrete frequency domain for a two-point record. Thus the energy is split into a DC component and a component at the Nyquist frequency. In the sample domain, time only exists at the sample points. The discrete-frequency-domain corollary is that only frequencies which are harmonics of Fs/(2*N), where Fs is the sample frequency and N is the record length, can be contained within a single bin. All others will be split across two bins, which is why we use HD when appropriate; it greatly enhances spectral resolution, as shown below.
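The bin-splitting effect is easy to demonstrate with a toy example (the record length and frequencies below are illustrative, not taken from the analysis above):

(* An on-bin sinusoid (8 cycles in 64 samples) occupies exactly two DFT bins
   (the +/- frequency pair); an off-bin one (8.5 cycles) leaks into all bins *)
n = 64;
onBin = Chop[Abs[Fourier[Table[Sin[2*Pi*8*t/n], {t, 0, n - 1}]]], 10^-6];
offBin = Chop[Abs[Fourier[Table[Sin[2*Pi*8.5*t/n], {t, 0, n - 1}]]], 10^-6];
{Count[onBin, x_ /; x > 0], Count[offBin, x_ /; x > 0]}
(* -> {2, 64}: the off-bin tone is smeared across every bin *)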
Clearly, though, a two-point record was not what I had in mind when I was trying (inartfully) to explain Nyquist sampling to someone not well versed in the subject, using a simple geometric analogy. I apologize for my error. I should have stated that two points define a sine wave to within the frequency accuracy determined by the record length.
All of this has nothing at all to do with the question at hand, since we are not dealing with two-point records, and I am not using Fourier analysis above, except as a starting point for the fit routine. Which brings us to this: “he claims he can determine a cycle of a most precise length (170.7 years) from 110 years of data.”
Well let us see with a little experiment using my HD algorithm:
(* Create a 113-point record of a sine wave of period 170.705678 years, sampled at yearly intervals *)
sin = Table[Sin[n*2.*Pi/170.705678], {n, 0, 112}];
(* extract the frequency of the highest peak as a bound for the fit routine *)
psd = PowerSpectralDensity[sin, w];
wmax = ArgMax[{psd, w > 0}, w];
(* do the curve fit, bounding the frequency to within +/- one bin of wmax found above *)
fit = FindFit[sin, {a*Sin[wo*t + phi],
   wmax - 1/113 < wo < wmax + 1/113, -Pi < phi < Pi},
  {a, wo, phi}, t, Method -> NMinimize]
(* express the frequency to 10 decimal digits precision for Willis*)
N[(2*Pi/wo) /. fit, 10]
{a -> 1., wo -> 0.0368071, phi -> -0.0368054}
170.706
So the good folks at Wolfram extracted the correct frequency not just to 0.1 years but to 0.001 years.
Note it got the amplitude (a) exactly, to the decimal precision of Mma, and the phase was off by just 0.037 radians (2 degrees).
That was fun, let’s try it again, adding in some bandwidth-limited noise of variance .03, the value extracted from the AR model fit.
sin = Table[Sin[n*2.*Pi/170.705678], {n, 0, 112}] +
RandomVariate[NormalDistribution[0, .03], 113];
psd = PowerSpectralDensity[sin, w];
wmax = ArgMax[{psd, w > 0}, w];
fit = FindFit[sin, {a*Sin[wo*t + phi],
   wmax - 1/113 < wo < wmax + 1/113, -Pi < phi < Pi, a > 0},
  {a, wo, phi}, t, Method -> NMinimize]
N[(2*Pi/wo) /. fit, 10]
{a -> 1.00917, wo -> 0.0368329, phi -> -0.0459094}
170.586
So we’re off by 170.705678 − 170.586 = 0.12 years; sue me.
Note that in the algorithm above, the only use of the FFT was in the PSD routine whose only purpose is to locate the frequency within a bin. HD has much better spectral resolution (as shown here) which is why we use it.
“His response to my pointing out that the climate is chaotic was also most puzzling …”
“Even Chaos can have an underlying harmonic structure (for example, fractals, Strange attractors, etc.)”
Perhaps you are puzzled more than you admit. That wasn’t my quote but Steven Groeneveld’s (September 12, 2013 at 11:32 am).
As far as the other aspersions on my character, I’ll just chalk those up to Willis being Willis.
All the best
JP
1sky1 says:
September 16, 2013 at 3:12 pm
If you ignore physics completely and concentrate solely on DSP algorithms applied to woefully short records, you can find all sorts of “cycles” that have no embodiment in reality.
I agree, but separating those that embody reality from those that do not is the key. The goodness-of-fit obtained above with a simple harmonic fit is certainly not definitive given the 1/12 probability of it being just a statistical fluke, but neither do I think it is as easily dismissed as you imply. As for being an artifact of the Hadcrut3 record, it is hard to see how a 60-year cycle could be introduced accidentally. The fit algorithm is highly insensitive to noise and single events as it minimizes LSE across the entire record. I would also note that SSA analysis of the SST record shows a similar periodicity (see http://montpeliermonologs.wordpress.com/2013/09/21/periodicity-in-the-sst-record/).
Regards,
Jeff
1sky1 says:
September 16, 2013 at 3:12 pm
And you’ll find that the results of SSA depend quite critically upon some rather arbitrarily chosen analysis parameters.
Actually the analysis I’m doing is quite insensitive to the choice of L, the window parameter, which only controls the fineness of the separation between the two buckets. The motivation for this approach is simple. We are trying to detect a small signal in noisy data. Information theory tells us that information-bearing signals must be non-ergodic. SSA allows us to separate the ergodic mode from the non-ergodic modes without loss of information (the sum of the two buckets always equals the original data). Since the ergodic (noise) bucket uses all modes not in the non-ergodic (information-bearing) bucket, the analysis depends only on L, which in turn only affects the spectral bin width. Using the maximum available L gives us the finest split between the two buckets, so that’s what I use, but the results are nearly identical until L gets below 20 or so. (A sketch of this kind of split appears below.)
[The “link above” (in the words analysis I’m doing”) fails. Mod]
[Link repaired. -w.]
Oops, muffed a closing italic tag above (could someone fix this?). Sorry. Also a typo: “SSA allows us to separate the ergodic mode” should read “modes”.
Thanks for fixing my dangling tag. The link should have pointed to http://montpeliermonologs.wordpress.com/2013/09/16/global-warming-a-la-mode/
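For readers unfamiliar with SSA, here is a hypothetical sketch of the two-bucket split described above (ssaSplit is an illustrative helper, not the code behind the linked analysis):

(* Embed the series with window L, take the SVD of the trajectory matrix, and
   rebuild a "signal" bucket from the leading modes by anti-diagonal averaging;
   the two buckets always sum back to the original data *)
ssaSplit[x_List, L_Integer, keep_Integer] :=
 Module[{n = Length[x], K, traj, u, s, v, recon, sig},
  K = n - L + 1;
  traj = Table[x[[i ;; i + L - 1]], {i, K}];  (* K x L trajectory matrix *)
  {u, s, v} = SingularValueDecomposition[traj];
  recon[k_] := Module[{m = s[[k, k]]*Outer[Times, u[[All, k]], v[[All, k]]]},
    Table[Mean[Table[m[[i, tt - i + 1]], {i, Max[1, tt - L + 1], Min[K, tt]}]], {tt, n}]];
  sig = Total[recon /@ Range[keep]];
  {sig, x - sig}]  (* {non-ergodic bucket, ergodic bucket} *)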
@1sky1 4:13 pm
When “memory” is involved–which capacitive systems always manifest–it’s an entirely different ball game. And with such systems subjected to stochastic excitation, you get closer to how the climate system appears to operate.
The climate, especially with life included, is a capacitive system, no doubt.
Your civilized discourse nevertheless is much appreciated.
Likewise. I look forward to reading from 1sky1 in future threads.
@Jeff Patterson 9:42 am
fit = FindFit[sin, {a*Sin[wo*t + phi], wmax - 1/113 < wo < wmax + 1/113, -Pi < phi < Pi}, {a, wo, phi}, t, Method -> NMinimize]
sin is your manufactured 113 sample signal with a 170 sample cycle. (BTW, using “sin” as a name for a vector variable for a Sin() function is a sin. Vsin, V1sin, would have been better.)
Wmax is the maximum frequency in the power spectrum, which will be the first, lowest frequency, 1/113 Hz. (Are the units of PowerSpectralDensity in Hz or radians per time sample?)
What your FindFit[] amounts to is:
Find the best a (amplitude), wo (frequency) and phi (phase) for a single sine wave such that [(sin[[t]] - a*Sin[wo*t + phi])^2, summed over t = {1, 113}] is minimized,
searching wo across the narrow range 0 to 2/113, with phi greater than -Pi.
The FindFit you used assumes there is only one sine wave to fit, that you were interested in fewer than 2 cycles over the analysis, and that there is no time-linear bias component; i.e., the linear components b*t + c are set to zero by their omission. I think your solution was a little over-specified for a general problem.
Stephen Rasey says:
September 21, 2013 at 2:29 pm
the linear components b*t + c are set to zero by their omission. I think your solution was a little over-specified for a general problem.
I don’t think so. Wmax is the maximum frequency found in the PSD, not the first. We know it is accurate to within the spectral resolution of the DFT used in the power spectral density routine, hence we search over that bandwidth (+/- one bin). The data was generated de-trended, so I didn’t include any bias terms. In the actual HD algorithm, the data is de-trended and the mean removed (actually we use a process called periodization, which is slightly different and gives better results), but we also include a dc term and sometimes a slope term.
“Wmax is the maximum frequency found in the PSD, not the first.” It is also the first, as in the first bin, which means your search is including the zero point. I’m just going by your Mathematica code above. If wo = 0.0368 (radians/sample), the search range had to include values below 1/113 Hz = 2*Pi/113 radians/sample.
Your Mathematica code did not detrend, so it is not proof that you can get the same answer if you detrend, as you say you do in the ‘HD’ case. You will not. As I showed in my Sept 12 9:44 am comment, “detrending” is a fiction. Applying a detrend to a non-integer number of cycles will contaminate the analysis with a false trend and alter the frequency.