This DSP engineer is often tasked with extracting spurious signals from noisy data. He submits this interesting result of applying these techniques to the HadCRUT temperature anomaly data. Digital Signal Processing analysis suggests cooling ahead in the immediate future, with no significant probability of a positive anomaly exceeding 0.5°C between 2023 and 2113. See Figures 13 and 14. Code and data are made available for replication. – Anthony
Guest essay by Jeffery S. Patterson, DSP Design Architect, Agilent Technologies
Harmonic Decomposition of the Modern Temperature Anomaly Record
Abstract: The observed temperature anomaly since 1900 can be well modeled with a simple harmonic decomposition of the temperature record based on a fundamental period of 170.7 years. The goodness-of-fit of the resulting model significantly exceeds the expected fit to a stochastic AR sequence matching the general characteristics of the modern temperature record.
Data
I’ve used the monthly Hadcrut3 temperature anomaly data available from http://woodfortrees.org/data/hadcrut3vgl/every as plotted in Figure 1.
Figure 1 – Hadcrut3 Temperature Record 1850-Present
To remove seasonal variations while avoiding spectral smearing and aliasing effects, the data was box-car averaged over a 12-month period and decimated by 12 to obtain the average annual temperature plotted in Figure 2.
Figure 2 – Monthly data decimated to yearly average
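For readers without Mathematica, the box-car-and-decimate step can be sketched in Python; the helper name `annualize` is mine, not the author's, and the boxcar is assumed aligned to calendar-year boundaries:

```python
import numpy as np

def annualize(monthly):
    """12-month box-car average followed by decimation by 12.
    `monthly` is a 1-D array of monthly anomalies; any trailing
    partial year is dropped."""
    monthly = np.asarray(monthly, float)
    n_years = len(monthly) // 12
    # year-aligned boxcar + decimate is just a per-year mean
    return monthly[:n_years * 12].reshape(n_years, 12).mean(axis=1)

# toy check: a pure 12-month seasonal cycle averages to (near) zero
t = np.arange(24)
seasonal = np.sin(2 * np.pi * t / 12)
print(annualize(seasonal))  # two values, both ~0
```

Because the filter length equals the decimation factor and is year-aligned, averaging each block of 12 is exactly equivalent to the boxcar-then-decimate sequence described above.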
A Power Spectral Density (PSD) plot of the decimated data reveals harmonically related spectral peaks.
Figure 3 – PSD of annual temperature anomaly in dB
To eliminate the possibility that these are FFT (Fast Fourier Transform) artifacts while avoiding the spectral leakage associated with data windowing, we use a technique called record periodization. The data is regressed about a line connecting the record endpoints, dropping the last point in the resulting residual. This process eliminates the endpoint discontinuity while preserving the position of the spectral peaks (although it does attenuate the amplitudes at higher frequencies and modifies the phase of the spectral components). The PSD of the residual is plotted in Figure 4.
Figure 4 – PSD of the periodized record
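A minimal Python sketch of the periodization step, assuming "regressed about a line connecting the record endpoints" means subtracting the straight line joining the first and last samples (the helper name `periodize` is hypothetical):

```python
import numpy as np

def periodize(x):
    """Subtract the line joining the record endpoints, then drop the
    final sample. Both endpoints of the detrended record are zero, so
    the remaining sequence wraps without a discontinuity."""
    x = np.asarray(x, float)
    n = len(x)
    line = x[0] + (x[-1] - x[0]) * np.arange(n) / (n - 1)
    return (x - line)[:-1]

r = periodize([1.0, 2.5, 2.0, 4.0])
print(r)  # first element is exactly zero; last (dropped) point was too
```

The FFT of the result has no endpoint-discontinuity leakage, at the cost of the amplitude/phase modification noted in the text.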
Since the spectral peaking is still present we conclude these are not record-length artifacts. The peaks are harmonically related, with odd harmonics dominating until the eighth. Since spectral resolution increases with frequency, we use the eighth harmonic of the periodized PSD to estimate the fundamental. The following Mathematica (Mma) code finds the 5th peak (8th harmonic) and estimates the fundamental.
wpkY1=Abs[ArgMax[{psdY,w>.25},w]]/8
0.036811
The units are radian frequency across the Nyquist band, mapped to ±π (the plots are zoomed to 0 < w < 1 to show the area of interest). To convert to years, invert wpkY1 and multiply by 2π, which yields a fundamental period of 170.7 years.
From inspection of the PSD we form the harmonic model (note all of the radian frequencies are harmonically related to the fundamental):
(*Define the 5th order harmonic model used in curve fit*)
model=AY1*Sin[wpkY1*t+phiY1]+AY2*Sin[2*wpkY1*t+phiY2]+
  AY3*Sin[3*wpkY1*t+phiY3]+AY4*Sin[4*wpkY1*t+phiY4]+
  AY5*Sin[5*wpkY1*t+phiY5];
vars= {AY1,phiY1,AY2,phiY2,AY3,phiY3,AY4,phiY4,AY5,phiY5 }
and fit the model to the original (unperiodized) data to find the unknown amplitudes, AYx, and phases, phiYx.
fitParms1=FindFit[yearly,model,vars,t]
fit1=Table[model/.fitParms1,{t,0,112}];
residualY1=yearly-fit1;

{AY1→-0.328464, phiY1→1.44861, AY2→-0.194251, phiY2→3.03246,
 AY3→0.132514, phiY3→2.26587, AY4→0.0624932, phiY4→-3.42662,
 AY5→-0.0116186, phiY5→-1.36245, AY8→0.0563983, phiY8→1.97142,
 wpkY1→0.036811}
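Because the harmonic frequencies are fixed once the fundamental is chosen, each term A·sin(kωt+φ) can be rewritten as a·sin(kωt)+b·cos(kωt), making the fit linear. A Python sketch of this approach (not the author's FindFit call; `harmonic_fit` is a hypothetical helper):

```python
import numpy as np

def harmonic_fit(y, w, order=5):
    """Least-squares fit of sum_k A_k*sin(k*w*t + phi_k) with the
    harmonic frequencies held fixed. Linear in a sin/cos basis, so
    plain lstsq suffices; amplitudes/phases follow from (a_k, b_k)."""
    t = np.arange(len(y))
    cols = []
    for k in range(1, order + 1):
        cols += [np.sin(k * w * t), np.cos(k * w * t)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef  # fitted series

# toy check: exactly recover a known two-harmonic signal
w = 0.0368
t = np.arange(113)
y = 0.3 * np.sin(w * t + 1.4) + 0.1 * np.sin(3 * w * t + 2.2)
fit = harmonic_fit(y, w)
print(np.allclose(fit, y, atol=1e-8))  # True
```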
The fit is shown in Figure 5 and the residual error in Figure 6.
Figure 5 – Harmonic model fit to annual data
Figure 6 – Residual Error
Figure 7 – PSD of the residual error
The residual is nearly white, as evidenced by Figure 7, justifying use of the Hodrick-Prescott filter on the decimated data. This filter is designed to separate cyclical, non-stationary components from data. Figure 8 shows an excellent fit with a smoothing factor of 15.
Figure 8 – Model vs. HP Filtered data (smoothing factor=3)
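The Hodrick-Prescott trend is the solution of a penalized least-squares problem and can be computed directly; a dense-matrix Python sketch (adequate for a 113-point series, though production code would use sparse matrices):

```python
import numpy as np

def hp_filter(y, lam):
    """Hodrick-Prescott trend: minimize sum((y - tau)^2) +
    lam * sum((second difference of tau)^2), i.e. solve
    (I + lam * D'D) tau = y with D the second-difference operator."""
    y = np.asarray(y, float)
    n = len(y)
    D = np.diff(np.eye(n), 2, axis=0)  # (n-2) x n second differences
    return np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)

# sanity check: with lam = 0 the "trend" reproduces the data exactly
y = np.array([1.0, 2.0, 1.5, 3.0, 2.5])
print(np.allclose(hp_filter(y, 0.0), y))  # True
```

Larger `lam` gives a smoother trend; `lam -> infinity` approaches a straight-line fit.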
Stochastic Analysis
The objection that this is simple curve fitting can be rightly raised. After all, harmonic decomposition is a highly constrained form of Fourier analysis, which is itself a curve fitting exercise that yields the harmonic coefficients (where the fundamental is the sample rate) which recreate the sequence exactly in the sample domain. That does not mean, however, that any periodicity found by Fourier analysis (or, by implication, harmonic decomposition) is not present in the record. Nor, as will be shown below, is it true that harmonic decomposition on an arbitrary sequence would be expected to yield the goodness-of-fit achieved here.
The 113-sample record examined above is not long enough to attribute statistical significance to the fundamental 170.7-year period, although others have found significance in the 57-year (here 56.9-year) third harmonic. We can, however, estimate the probability that the results are a statistical fluke.
To do so, we use the data record to estimate an AR process.
procY=ARProcess[{a1,a2,a3,a4,a5},v];
procParamsY = FindProcessParameters[yearlyTD["States"],procY]
estProcY= procY /. procParamsY
WeakStationarity[estProcY]
{a1→0.713,a2→0.0647,a3→0.0629,a4→0.181,a5→0.0845,v→0.0124391}
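A least-squares AR(p) estimate along the lines of FindProcessParameters can be sketched in Python (`fit_ar` is a hypothetical helper, checked here on a simulated AR(1)):

```python
import numpy as np

def fit_ar(x, p=5):
    """Least-squares AR(p) fit: x[t] ~ sum_k a_k * x[t-k].
    Returns the coefficient vector a and the innovation variance v."""
    x = np.asarray(x, float)
    # row i of X holds the p lagged values preceding x[p + i]
    X = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    resid = x[p:] - X @ a
    return a, resid.var()

# toy check on a simulated AR(1) with a1 = 0.7
rng = np.random.default_rng(0)
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = 0.7 * x[t - 1] + rng.standard_normal()
a, v = fit_ar(x, p=1)
print(a[0])  # close to 0.7
```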
As can be seen in Figure 9 below, the process estimate yields a reasonable match to observed power spectral density and covariance function.
Figure 9 – PSD of estimated AR process (red) vs. data Figure 9b – Correlation function (model in blue)
Figure 10 – 500 trial spaghetti plot Figure 10b – Three paths chosen at random
As shown in 10b, the AR process produces sequences which in general character match the temperature record. Next we perform a fifth-order harmonic decomposition on all 500 paths, taking the variance of the residual as a goodness-of-fit metric. Of the 500 trials, harmonic decomposition failed to converge 74 times, meaning that no periodicity could be found which reduced the variance of the residual (this alone disproves the hypothesis that any arbitrary AR sequence can be decomposed). To these failed trials we assigned the variance of the original sequence. The scattergram of results is plotted in Figure 11 along with a dashed line representing the variance of the model residual found above.
Figure 11 – Variance of residual; fifth order HC (Harmonic Coefficients), residual 5HC on climate record shown in red
We see that the fifth-order fit to the actual climate record produces an unusually good result. Of the 500 trials, 99.4% resulted in residual variance exceeding that achieved on the actual temperature data. Only 1.8% of the trials came within 10% and 5.2% within 20%. We can estimate the probability of achieving this result by chance by examining the cumulative distribution of the results plotted in Figure 12.
Figure 12 – CDF (Cumulative Distribution Function) of trial variances
The CDF estimates the probability of achieving these results by chance at ~8.1%.
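The Monte Carlo procedure (simulate AR paths matching the record, harmonically decompose each, and tabulate residual variances) can be sketched as follows. The AR coefficients are those estimated above; the 0.002 "observed variance" threshold is a placeholder for illustration, not a value from the post:

```python
import numpy as np

def ar_path(a, v, n, rng):
    """Simulate one AR(len(a)) path of length n, innovation variance v."""
    p = len(a)
    x = np.zeros(n + p)
    for t in range(p, n + p):
        # x[t-p:t][::-1] orders lags most-recent-first to match a[0] = lag 1
        x[t] = np.dot(a, x[t - p:t][::-1]) + rng.normal(scale=np.sqrt(v))
    return x[p:]

def harmonic_resid_var(y, w, order=5):
    """Residual variance after a fixed-frequency harmonic LS fit."""
    t = np.arange(len(y))
    X = np.column_stack([f(k * w * t) for k in range(1, order + 1)
                         for f in (np.sin, np.cos)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (y - X @ coef).var()

rng = np.random.default_rng(1)
a, v = [0.713, 0.0647, 0.0629, 0.181, 0.0845], 0.0124
stats = [harmonic_resid_var(ar_path(a, v, 113, rng), 0.036811)
         for _ in range(200)]
# fraction of null trials beating a hypothetical observed residual variance
p_value = np.mean(np.array(stats) <= 0.002)
print(p_value)
```

With the real observed variance in place of the placeholder, `p_value` estimates the probability of obtaining the fit by chance, the quantity read off the CDF above.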
Forecast
Even if we accept the premise of statistical significance, without knowledge of the underlying mechanism producing the periodicity, forecasting becomes a suspect endeavor. If for example, the harmonics are being generated by a stable non-linear climatic response to some celestial cycle, we would expect the model to have skill in forecasting future climate trends. On the other hand, if the periodicities are internally generated by the climate itself (e.g. feedback involving transport delays), we would expect both the fundamental frequency and importantly, the phase of the harmonics to evolve with time making accurate forecasts impossible.
Nevertheless, having come thus far, who could resist a peek into the future?
We assume the periodicity is externally forced and the climate response remains constant. We are interested in modeling the remaining variance so we fit a stochastic model to the residual. Empirically, we found that again, a 5th order AR (autoregressive) process matches the residual well.
tDataY=TemporalData[residualY1-Mean[residualY1],Automatic];
yearTD=TemporalData[residualY1,{ DateRange[{1900},{2012},"Year"]}]
procY=ARProcess[{a1,a2,a3,a4,a5},v];
procParamsY = FindProcessParameters[yearTD["States"],procY]
estProcY= procY /. procParamsY
WeakStationarity[estProcY]
A 100-path, 100-year run combining the paths of the AR model with the harmonic model derived above is shown in Figure 13.
Figure 13 – Projected global mean temperature anomaly (centered 1950-1965 mean)
Figure 14 – Survivability at 10 (Purple), 25 (Orange), 50 (Red), 75 (Blue) and 100 (Green) years
The survivability plots predict no significant probability of a positive anomaly exceeding 0.5°C between 2023 and 2113.
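The forecast construction (deterministic harmonic model plus simulated AR residual paths) can be sketched like this; the single-harmonic `model` lambda is purely illustrative, not the full fitted model:

```python
import numpy as np

def forecast_paths(model_fn, a, v, t0, horizon, n_paths, rng):
    """Deterministic harmonic model evaluated over the forecast window,
    plus independent simulated AR residual paths."""
    p = len(a)
    t = np.arange(t0, t0 + horizon)
    paths = np.empty((n_paths, horizon))
    for i in range(n_paths):
        x = np.zeros(horizon + p)
        for s in range(p, horizon + p):
            x[s] = np.dot(a, x[s - p:s][::-1]) + rng.normal(scale=np.sqrt(v))
        paths[i] = model_fn(t) + x[p:]
    return paths

w = 0.036811
model = lambda t: -0.328 * np.sin(w * t + 1.449)  # first harmonic only
rng = np.random.default_rng(2)
paths = forecast_paths(model, [0.713, 0.0647, 0.0629, 0.181, 0.0845],
                       0.0124, 113, 100, 100, rng)
print(paths.shape)  # (100, 100): 100 paths, 100 years
```

Survivability curves then follow by counting, at each horizon, the fraction of paths that have not yet exceeded a given threshold.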
Discussion
With a roughly one-in-twelve chance that the model obtained above is the manifestation of a statistical fluke, these results are not definitive. They do however show that a reasonable hypothesis for the observed record can be established independent of any significant contribution from greenhouse gases or other anthropogenic effects.
Ric Werme says:
September 11, 2013 at 5:43 am
A few comments:
1) decimated by 12
I know that “decimate” came from the Roman practice of inspiring soldiers after battles where the commanders felt the troops were uninspired. They lined up the soldiers and killed every tenth man, leaving the remaining 90% inspired to win the next battle. At first blush, this sounds like torturing the data until it confesses, or perhaps this is “dodecimated” – killing every 12th measurement. :-)
Ric, in this instance “decimated” is a highly specific technical term from digital signal processing, so don’t go looking for a highway lined with occupied crosses.
And no adherence to the ten implication is required.
And I know just about enough about it to be a hazard to myself; but if an Agilent Cowboy doesn’t know Digital Signal Processing, then nobody does.
My concern with the exercise is that I wonder what result one could pull out of the phone numbers in the Manhattan telephone directory.
It’s not clear to me that the Hadcrud numbers actually comprise data that relates to planet earth.
And having seen in the past, a mathematical exercise, that recovered the value of one of the fundamental physical constants of science (the fine structure constant), to about 8 significant digits of precision (about 1/3 of the standard deviation of the best experimental value, at that time); yet the model had precisely zero input from the physical universe; it was literally just fetzing around with numbers; so I don’t go for all new things, as an early adopter. I’m often the last to see the light.
I have quite recently seen a movie called “Gone with the Wind”, for the very first time.
Stephen Rasey says:
September 11, 2013 at 12:04 pm
@Jeff Patterson at 11:31 am
Thus we can reconstruct the _exact_ spectrum of the unperiodized record by subtracting out the known transform of the [known] line (ramp),
“You don’t know the line. 113 points in the dataset is just an artifact of how long someone was recording data. When the data starts and ends is similarly an artifact. Different starts and different ends give different lines and therefore different transforms. I expect differences in the high frequencies and power at the low frequencies.”
You know the line exactly because you create it! More precisely, it is a periodic sawtooth that returns to zero for each record. You know its slope, amplitude and period, and thus its Fourier (not DFT) transform exactly. The (complex) coefficients of this transform can be evaluated at the DFT bins and subtracted from the periodized DFT to get the spectrum of the unperiodized record. Do the math, try an experiment or two. It works (and we’ve been shipping it in products for years!)
I should say that I enjoyed the post. Why start the series at year = 1900?
The following is well-said: If for example, the harmonics are being generated by a stable non-linear climatic response to some celestial cycle, we would expect the model to have skill in forecasting future climate trends. On the other hand, if the periodicities are internally generated by the climate itself (e.g. feedback involving transport delays), we would expect both the fundamental frequency and importantly, the phase of the harmonics to evolve with time making accurate forecasts impossible.
The author estimates 10 parameters, after estimating the fundamental period. Why not just fit a 10th order polynomial or a Bayesian Adaptive Regression Spline model, or some well-selected set of 10 wavelets — as long as you are acknowledging that this is merely another curve-fitting exercise with no claim to predictive power, any old and new functions are just as good. If the fundamental frequency is evolving with time, in what sense is it a “fundamental” frequency?
The harmonic technique used is useful in freeing the process of reproducing a ramp determined by the length of the dataset.
The first, second and third harmonics are just recreating the longer ramp. The most notable is perhaps the strong 8th.
170/8= 21.25 years. Also one of the three periods my 3 cosine analysis latched on to.
The suggestion of Schwabe solar cycle is obvious.
mpainter says:
September 11, 2013 at 9:45 am
It has been said that if it is good science, it does not require a statistical massage.
Yeah, usually by non-scientists. It is not science if there is no measurement of phenomena (signal). There is no measurement without error (noise). There is no way to sort out the signal from the noise unless you have an insight into, and a command of, the mathematics describing the noise statistics. Try it sometime. It ain’t so easy as you might imagine.
Pamela Gray: Comment 1: While the data massaging here is extreme, it does demonstrate the futility of the ordinary least squares statistics so adored by AGWing scientists.
This is non-linear least squares, after finding the fundamental period.
“I have quite recently seen a movie called “Gone with the Wind”, for the very first time.”
Yeah, I should find to time to watch too, one day. 😉
Jeff Patterson: As the stochastic fitting exercise showed, JvN could only do this 0.6% of the time if all he had to work with was harmonically related sine waves.
That’s smart.
Greg Goodman says: That is precisely where the problem lies. It _had been_ heavily low-pass filtered and the evidence is that it was not correctly anti-aliased before that was done.
I looked over the referenced paper and while I certainly agree that aliased data is irretrievably corrupted, the question is by how much? The fundamental of the extracted model has a peak-to-peak amplitude of 0.6 degrees. If this were an alias (remembering that the amplitude of the alias is equal to that of the high-frequency periodicity being aliased), surely someone would have detected the 0.6-degree high-frequency signal in the data. It would only require a few weeks of hourly data to find it!
Greg Goodman says:
September 11, 2013 at 12:01 pm
JP: ” forget to add a comment on forecasting: In retrospect I wish I had left this section out”
Yep, forgetting the title of the post may be the best approach.
I didn’t choose the post title which I agree was unfortunate.
See – owe to Rich says:
Goodman and Suamarez have many criticisms of this article, but no-one has raised the things which immediately troubled me. First, given that HadCRUT3 starts in 1850, why did JP throw away 50 years and start at 1900? This is not explained, and is fishy. Second, if the model is good at predicting forwards, how well does it do in predicting backwards (hindcasting) the said 1850-1899 period? Third, since the data length is 113 years, is not a harmonic of 56.9 years (almost half of the 113) extremely fishy?
170.7/3*2=113 = data length, oo-errr.
Well spotted.
“This DSP engineer is often tasked with extracting spurious signals from noisy data. He submits this interesting result of applying these techniques to the HadCRUT temperature anomaly data.”
-Anthony
Freudian slip?
Nick Stokes says: How can you establish a 170 year periodicity from 113 years of data?
Basically, all he did was estimate a nonlinear function from the set of parameterized (phase and amplitude) family of sine curves. Using that estimated period, he then chose harmonics of it as the additional terms in a multiple non-linear regression. In harmonic regression, which is what this is, there is no non-arbitrary way to select the fundamental frequency: the standard in Fourier analysis is to let the full data series represent 1 full period, which may not have any relevance if you have managed to sample, say, over 1.25 times the period.
This is off topic somewhat and is a criticism of computer-run climate models. Picked this up in the comments section over at JoNova and think it’s hilarious:
“Roy Hogue
September 12, 2013 at 5:42 am
…a foundation of bullshit coming from computers, not bovines.
Then let’s call it what it is — bitshit.
And you’re right about it without any doubt too. 🙂”
@Jeff Patterson at 12:21
You know the line exactly because you create it!
YES!
YOU create it. Not Nature. It is an artifact of the measurement time window and processing.
That is why you must try different subsets and get different lines to explore the changes in the resulting transform. Then look for similarities that are more likely in the signal and not an artifact of the processing.
It’s like trying different stacking velocities in seismic processing to improve the signal-noise ratio.
rabbit: Have you considered Singular Spectrum Analysis (SSA) to extract the dominant modes of the temperature graph? This does not suffer from the windowing effects that a DFT suffers from, nor is it limited to a prescribed set of basis functions.
I second the question. After choosing the base frequency, why did you limit yourself to harmonics of the base frequency? Is there relevant physical theory that the important frequencies are multiples of the base frequency?
JP: “I looked over the referenced paper and while I certainly agree that aliased data is irretrievably corrupted, the question is by how much? The fundamental of the extracted model has a peak-to-peak amplitude of .6 degrees.”
No, I was not suggesting the whole long-term trend is an alias. Neither did RC’s article on Climate Etc suggest that degree of error. But if there is a lack of filtering it will not just be one frequency affected. RC demonstrated that this can lead to spurious decadal-length trends (did you read the link?).
If there is error in the long term it is just as likely ‘bias correction’. I had the fortunate opportunity to discuss this at some length with John Kennedy in the comments. He was very forthcoming and helpful, though obviously defending their work.
What he told me of the validation process showed it was based on circular logic and geographically anecdotal “validation” from Japanese data also using _buckets_. (Apparently Japanese buckets don’t ‘bias’ like British and American ones!)
For the title , that’s editors for you . They’re always out for a catchy headline. 😉
PS the Japanese paper was full of bias confirmation too. They only retained the data which suggested the Hadley corrections went in the right direction. (The ‘validation’ was not any stronger than that.) But all those going the other way were deemed “not suitable” and removed from consideration.
This is the problem with the state of the literature after 30 years of biased pseudo-science and political gate-keeping. The garbage is stacked several pallets high now.
Jeff, is there any sort of robustness tests for this method? Can the time series be analyzed on a shorter interval, and see if the predicted results for the out of sample period match?
I used to program DSPs and I can assure everyone they do not predict the future.
Correct me if I’m wrong, but any non-random data points will show harmonics. I’m not sure what AGW would show but I would think it would simply be a S/N ratio of an AGW harmonic on top of natural harmonics. In fact, mixing the two (natural and AGW) would result in seeing the AGW mix in even pre-AGW data as it would be unable to blend out. I think the assumption using DSP is that temperature is the convolution of linear functions and that may not be the case.
My gut tells me that DSP methods will treat random steps as non-harmonic noise and will not detect the random and non-linear changes in DC offset that are characteristic of AGW. My gut is that a rise in temperature that is a sequence of step changes, random in time and magnitude, will be washed out in the deconvolution, and the individual residual components that make up long-term AGW would be below the noise level. Put another way, AGW may manifest itself as a sequence of below-Nyquist-rate events that all lie below the noise level. If the residual magnitudes look like a “black body radiation chart” with a peak that is below the noise level, and a frequency longer than Nyquist, I would suspect you could have AGW in that signal and not be accounting for it as non-random and non-periodic, quite possibly attributing it to natural variation. The fact that you can model the past as a harmonic comes from the math, which will always solve, less a residual. It has no predictive value, though, because a driving component is a non-linear addition to the data. The math is good at figuring out how many “mixers and sources” are present in the existing data (or rather how many it needs to reach a certain S/N ratio) but not so good at predicting how many mixers and sources will be present in the future. I believe if it’s random in time and magnitude, and small at each step, no amount of oversampling and decimation will detect a fingerprint to predict a future value. For example, ENSO might be a large harmonic source that describes a long variation in temperature. If the reality is that AGW consists of tiny, independent sources and mixers injecting AGW at random points (say 0.01C in June of 2007, then 0.012C for January 2008, and then another 0.005C in September 2009), the analysis of those below-Nyquist, below-noise-level events will either leave them out or model them as a harmonic that has a lower residual error, even if it’s really an accumulated error that will continue.
The method will always add a harmonic term to reduce the error if it can, whether it’s physical or not. I think the only way to overcome it is massive oversampling and decimation, but you may still run into the noise floor of the data before you can pick out small, random, non-periodic residual signals. It’s like describing something as a finite series, and every time your error gets too large, you add another term to the series that reduces the error. That’s what DSP algos will do, and they’re not discriminating as to whether the term is physically descriptive of the underlying phenomena. It’s a fit, and the number of terms is a function of the tolerable error and the technique. It works well to compare two different datasets, but it’s not predictive, and I’m not sure you can lump all the “fit” into natural variation and claim only the residual is a man-made artifact.
An interesting exercise would be to break the data up into ranges always starting at 0: 0-11 years, 0-23 years, 0-37 years, 0-61 years (maybe just prime intervals), see what the solution was at the end of each of those periods, and see how those functions compare to the full dataset. The PSD harmonics should not change as long as the dataset is longer than the fundamental mode’s Nyquist rate. If the PSD harmonics differ depending on dataset choices, I think there is a problem with the method and its assumptions.
Greg Goodman says: But if there is a lack of filtering it will not just be one frequency effected. RC demonstrated that this can lead to spurious decadal length trends (did you read the link?).
Color me dubious. The human body is about the closest thing I can think of to a continuous-time thermometer. Are we fooled into sensing a temperature trend that doesn’t exist due to aliasing? Of course not, because the earth’s temperature itself does not experience high-frequency fluctuations. All the more so with a real thermometer, because it couldn’t react to rapid changes even if they were there. If we then average equally spaced (in time) temperature readings, we are averaging already smoothed (by the slow reaction of the climate and the thermometer) data, so again no aliasing occurs. Sure, there are errors in the readings which may look like white noise, but these could not produce a measurable trend unless integrated to produce a random walk. What process are you proposing that would produce such a result?
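The filtering-before-decimation point at issue can be illustrated numerically: a tone above the output Nyquist rate aliases under naive subsampling, but box-car averaging first (as a thermometer's thermal lag or monthly averaging would do) strongly attenuates it. The frequencies and rates here are hypothetical:

```python
import numpy as np

# A 10 Hz sine sampled at 12 Hz masquerades as a 2 Hz alias at nearly
# full amplitude; averaging each 1/12 s window before decimating
# suppresses it by the boxcar's sinc response.
t = np.arange(0, 1, 1 / 1200)               # densely sampled "truth"
x = np.sin(2 * np.pi * 10 * t)
raw = x[::100]                              # naive 12 Hz sampling -> alias
smoothed = x.reshape(-1, 100).mean(axis=1)  # boxcar average, then decimate
print(np.abs(raw).max(), np.abs(smoothed).max())  # large vs. small
```

Whether the instrument's own response supplies that pre-filtering, as argued above, is the crux of the disagreement.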
Matthew R Marler says:
September 11, 2013 at 1:05 pm
After choosing the base frequency, why did you limit yourself to harmonics of the base frequency? Is there relevant physical theory that the important frequencies are multiples of the base frequency?
Because the PSD indicated a harmonic relation between the spectral peaks, and because harmonics are generated any time sinusoids drive a non-linear response (thermodynamic equilibrium involves a quartic relationship to temperature, if I remember correctly).
Why do you smooth and downsample to annual averages? High-frequency cutoff filtering is better done in the frequency domain, as you are doing an FFT anyway.
I do not see any spectral peaks here of any distinctive character. Which surprises me, because I would expect to see something distinctive in the neighborhood of 0.016 years^-1 (the 60-65 year cycle evident in the data). Rounded peaks are generally artifacts of finite record length, and not representative of actual underlying processes.