This DSP engineer is often tasked with extracting spurious signals from noisy data. He submits this interesting result of applying these techniques to the HadCRUT temperature anomaly data. Digital Signal Processing analysis suggests cooling ahead in the immediate future, with no significant probability of a positive anomaly exceeding 0.5°C between 2023 and 2113. See Figures 13 and 14. Code and data are made available for replication. – Anthony
Guest essay by Jeffery S. Patterson, DSP Design Architect, Agilent Technologies
Harmonic Decomposition of the Modern Temperature Anomaly Record
Abstract: The observed temperature anomaly since 1900 can be well modeled with a simple harmonic decomposition of the temperature record based on a fundamental period of 170.7 years. The goodness-of-fit of the resulting model significantly exceeds the expected fit to a stochastic AR sequence matching the general characteristic of the modern temperature record.
Data
I’ve used the monthly Hadcrut3 temperature anomaly data available from http://woodfortrees.org/data/hadcrut3vgl/every as plotted in Figure 1.
Figure 1 – Hadcrut3 Temperature Record 1850-Present
To remove seasonal variations while avoiding spectral smearing and aliasing effects, the data was box-car averaged over a 12-month period and decimated by 12 to obtain the average annual temperature plotted in Figure 2.
Figure 2 – Monthly data decimated to yearly average
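For readers following along in Mma, this step can be done in one line; a minimal sketch, assuming the monthly series has been read into a list called monthly (a name of my choosing, not necessarily that of the original code):

(*Box-car average over non-overlapping 12-month blocks and decimate by 12, i.e. take block means*)
yearly = Mean /@ Partition[monthly, 12];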
A Power Spectral Density (PSD) plot of the decimated data reveals harmonically related spectral peaks.
Figure 3 – PSD of annual temperature anomaly in dB
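A sketch of how the PSD could be formed (my own helper, not necessarily the code behind the figures); the interpolation makes the result usable as a function of the radian frequency w, the form ArgMax is handed below:

(*Periodogram in dB over 0 < w <= Pi, skipping the DC bin, returned as an interpolating function of radian frequency*)
psdOf[x_List] := Module[{n = Length[x], ft, freqs, specdB},
  ft = Fourier[x - Mean[x]];
  freqs = Table[2 Pi k/n, {k, 1, Floor[n/2]}];
  specdB = 10 Log10[Abs[ft[[2 ;; Floor[n/2] + 1]]]^2];
  Interpolation[Transpose[{freqs, specdB}]]];
Plot[psdOf[yearly][w], {w, 2 Pi/Length[yearly], Pi}]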
To eliminate the possibility that these are FFT (Fast Fourier Transform) artifacts, while avoiding the spectral leakage associated with data windowing, we use a technique called record periodization. The data is detrended about the line connecting the record endpoints, and the last point of the resulting residual is dropped. This process eliminates the endpoint discontinuity while preserving the position of the spectral peaks (although it does attenuate the amplitudes at higher frequencies and modifies the phase of the spectral components). The PSD of the residual is plotted in Figure 4.
Figure 4 – PSD of the periodized record
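A minimal sketch of the periodization step (helper names are mine):

(*Detrend about the line connecting the endpoints, then drop the last point so that the residual wraps smoothly end to end*)
periodize[x_List] := Module[{n = Length[x], line},
  line = Table[x[[1]] + (x[[n]] - x[[1]]) (i - 1)/(n - 1), {i, n}];
  Most[x - line]];
psdY = psdOf[periodize[yearly]][w];  (*the periodized PSD handed to ArgMax below*)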
Since the spectral peaking is still present we conclude these are not record-length artifacts. The peaks are harmonically related, with odd harmonics dominating until the eighth. Since spectral resolution increases with frequency, we use the eighth harmonic of the periodized PSD to estimate the fundamental. The following Mathematica (Mma) code finds the 5th peak (8th harmonic) and estimates the fundamental.
(*Locate the spectral peak above w = 0.25 (the 8th harmonic) and divide its frequency by 8 to estimate the fundamental*)
wpkY1 = Abs[ArgMax[{psdY, w > .25}, w]]/8

0.036811
The units are radian frequency across the Nyquist band, mapped to ±π (the plots are zoomed to 0 < w < 1 to show the area of interest). To convert to years, invert wpkY1 and multiply by 2π, which yields a fundamental period of 170.7 years.
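As a quick arithmetic check, using the estimate printed above:

(*Fundamental period in years: 2 Pi over the radian frequency per sample (one sample = one year)*)
2 Pi/0.036811

170.688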
From inspection of the PSD we form the harmonic model (note all of the radian frequencies are harmonically related to the fundamental):
(*Define the 5th order harmonic model used in curve fit*)
model = AY1*Sin[wpkY1 t + phiY1] + AY2*Sin[2*wpkY1*t + phiY2] +
   AY3*Sin[3*wpkY1*t + phiY3] + AY4*Sin[4*wpkY1*t + phiY4] +
   AY5*Sin[5*wpkY1*t + phiY5];
vars = {AY1, phiY1, AY2, phiY2, AY3, phiY3, AY4, phiY4, AY5, phiY5}
and fit the model to the original (unperiodized) data to find the unknown amplitudes, AYx, and phases, phiYx.
fitParms1 = FindFit[yearly, model, vars, t]
fit1 = Table[model /. fitParms1, {t, 0, 112}];
residualY1 = yearly - fit1;

{AY1→-0.328464, phiY1→1.44861, AY2→-0.194251, phiY2→3.03246,
 AY3→0.132514, phiY3→2.26587, AY4→0.0624932, phiY4→-3.42662,
 AY5→-0.0116186, phiY5→-1.36245, AY8→0.0563983, phiY8→1.97142,
 wpkY1→0.036811}
The fit is shown in Figure 5 and the residual error in Figure 6.
Figure 5 – Harmonic model fit to annual data
Figure 6 – Residual Error
Figure 7 – PSD of the residual error
The residual is nearly white, as evidenced by Figure 7, justifying use of the Hodrick-Prescott filter on the decimated data. This filter is designed to separate cyclical, non-stationary components from data. Figure 8 shows an excellent fit with a smoothing factor of 15.
Figure 8 – Model vs. HP Filtered data (smoothing factor=3)
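Mma has no built-in Hodrick-Prescott filter, so here is a minimal sketch of one in its penalized-least-squares form, should anyone wish to replicate the smoothing (names are mine):

(*HP trend: minimize ||y - tau||^2 + lambda ||D2.tau||^2, with D2 the second-difference operator*)
hpTrend[y_List, lambda_] := Module[{n = Length[y], d2},
  d2 = SparseArray[{Band[{1, 1}] -> 1, Band[{1, 2}] -> -2, Band[{1, 3}] -> 1}, {n - 2, n}];
  LinearSolve[IdentityMatrix[n] + lambda Transpose[d2].d2, y]];
hpY = hpTrend[yearly, 15];  (*smoothing factor per the text*)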
Stochastic Analysis
The objection that this is simple curve fitting can rightly be raised. After all, harmonic decomposition is a highly constrained form of Fourier analysis, which is itself a curve-fitting exercise that yields the harmonic coefficients (where the fundamental is the sample rate) which recreate the sequence exactly in the sample domain. That does not mean, however, that any periodicity found by Fourier analysis (or, by implication, harmonic decomposition) is not present in the record. Nor, as will be shown below, is it true that harmonic decomposition of an arbitrary sequence would be expected to yield the goodness-of-fit achieved here.
The 113-sample record examined above is not long enough to attribute statistical significance to the fundamental 170.7-year period, although others have found significance in the 57-year (here 56.9-year) third harmonic. We can, however, estimate the probability that the results are a statistical fluke.
To do so, we use the data record to estimate an AR process.
(*Fit a 5th-order AR process to the annual data*)
procY = ARProcess[{a1, a2, a3, a4, a5}, v];
procParamsY = FindProcessParameters[yearlyTD["States"], procY]
estProcY = procY /. procParamsY
WeakStationarity[estProcY]

{a1→0.713, a2→0.0647, a3→0.0629, a4→0.181, a5→0.0845, v→0.0124391}
As can be seen in Figure 9 below, the process estimate yields a reasonable match to the observed power spectral density and covariance function.
Figure 9 – PSD of estimated AR process (red) vs. data
Figure 9b – Correlation function (model in blue)
Figure 10 – 500 trial spaghetti plot
Figure 10b – Three paths chosen at random
As shown in 10b, the AR process produces sequences which match the general character of the temperature record. Next we perform a fifth-order harmonic decomposition on all 500 paths, taking the variance of the residual as a goodness-of-fit metric. Of the 500 trials, harmonic decomposition failed to converge 74 times, meaning that no periodicity could be found which reduced the variance of the residual (this alone disproves the hypothesis that any arbitrary AR sequence can be so decomposed). To these failed trials we assigned the variance of the original sequence. The scattergram of results is plotted in Figure 11, along with a dashed line representing the variance of the model residual found above.
Figure 11 – Variance of the residual from a fifth-order harmonic decomposition (5HC); the result for the climate record is shown in red
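A sketch of how the Monte Carlo experiment can be set up, assuming the paths are drawn with RandomFunction and refit with FindFit (helper names are mine, and the fallback for the non-convergent trials described above is omitted):

(*Draw 500 paths from the estimated AR process, refit the harmonic model to each with the fundamental left free, and take the residual variance as the goodness-of-fit metric*)
trials = 500;
paths = RandomFunction[estProcY, {0, 112}, trials]["Paths"];
residVar[path_] := Module[{vals = path[[All, 2]], parms},
  parms = Quiet@FindFit[vals, model, Append[vars, {wpkY1, 0.0368}], t];
  Variance[vals - Table[model /. parms, {t, 1, Length[vals]}]]];
trialVars = residVar /@ paths;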
We see that the fifth-order fit to the actual climate record produces an unusually good result. Of the 500 trials, 99.4% resulted in residual variance exceeding that achieved on the actual temperature data. Only 1.8% of the trials came within 10% and 5.2% within 20%. We can estimate the probability of achieving this result by chance by examining the cumulative distribution of the results plotted in Figure 12.
Figure 12 – CDF (Cumulative Distribution Function) of trial variances
The CDF estimates the probability of achieving these results by chance at ~8.1%.
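The curve in Figure 12 can be reproduced from the trial variances with an empirical distribution; a sketch (trialVars is from the Monte Carlo sketch above):

(*Empirical CDF of the 500 trial residual variances*)
ecdf = EmpiricalDistribution[trialVars];
Plot[CDF[ecdf, v], {v, 0, Max[trialVars]}]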
Forecast
Even if we accept the premise of statistical significance, without knowledge of the underlying mechanism producing the periodicity, forecasting becomes a suspect endeavor. If, for example, the harmonics are being generated by a stable non-linear climatic response to some celestial cycle, we would expect the model to have skill in forecasting future climate trends. On the other hand, if the periodicities are internally generated by the climate itself (e.g. feedback involving transport delays), we would expect both the fundamental frequency and, importantly, the phase of the harmonics to evolve with time, making accurate forecasts impossible.
Nevertheless, having come thus far, who could resist a peek into the future?
We assume the periodicity is externally forced and the climate response remains constant. We are interested in modeling the remaining variance, so we fit a stochastic model to the residual. Empirically, we found that, again, a 5th-order AR (autoregressive) process matches the residual well.
(*Fit a 5th-order AR process to the model residual*)
tDataY = TemporalData[residualY1 - Mean[residualY1], Automatic];
yearTD = TemporalData[residualY1, {DateRange[{1900}, {2012}, "Year"]}]
procY = ARProcess[{a1, a2, a3, a4, a5}, v];
procParamsY = FindProcessParameters[yearTD["States"], procY]
estProcY = procY /. procParamsY
WeakStationarity[estProcY]
A 100-path, 100-year run combining the paths of the AR model with the harmonic model derived above is shown in Figure 13.
Figure 13 – Projected global mean temperature anomaly (centered 1950-1965 mean)
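A sketch of how such a run can be assembled from the pieces above (names are mine; fitParms1 supplies the numeric wpkY1):

(*Extend the deterministic harmonic model 100 years past the record and add simulated residual paths from the AR model*)
horizon = 100; nPaths = 100;
harmonicFuture = Table[model /. fitParms1, {t, 113, 112 + horizon}];
arPaths = RandomFunction[estProcY, {0, horizon - 1}, nPaths]["Paths"];
forecasts = (harmonicFuture + #[[All, 2]]) & /@ arPaths;
ListLinePlot[forecasts, DataRange -> {2013, 2012 + horizon}]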
Figure 14 – Survivability at 10 (Purple), 25 (Orange), 50 (Red), 75 (Blue) and 100 (Green) years
The survivability plots predict no significant probability of a positive anomaly exceeding 0.5°C between 2023 and 2113.
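Reading Figure 14 as exceedance probabilities, a sketch of how that statement can be checked against the simulated paths (again my own naming, with forecasts from the sketch above):

(*Fraction of forecast paths whose anomaly exceeds 0.5 C within the first yr years of the run*)
exceedProb[yr_] := N@Mean[Boole[Max[Take[#, yr]] > 0.5] & /@ forecasts];
exceedProb /@ {10, 25, 50, 75, 100}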
Discussion
With a roughly one-in-twelve chance that the model obtained above is the manifestation of a statistical fluke, these results are not definitive. They do, however, show that a reasonable hypothesis for the observed record can be established independent of any significant contribution from greenhouse gases or other anthropogenic effects.
Stephen Rasey says:
September 21, 2013 at 10:02 pm
As I showed in my Sept 12 9:44 am comment, “detrending” is a fiction.
You didn’t apply the compensation correctly (as in, not at all). Periodization is a four-step process:
1) Detrend about the line connecting the endpoints.
2) DFT the data and the line.
3) Extract the complex spectral information of interest.
4) Correct with the corresponding complex bins from the line.
However, in the HD (harmonic decomposition) referred to in the article it isn’t necessary, as I have explained many times. The slight shift one gets from using the uncorrected results for the PSD just affects the starting point of the fit routine (which, BTW, uses a global gradient search algorithm that avoids getting trapped in local minima).
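For the record, the correction would look something like this (x is the raw record; all names are hypothetical):

(*DFT both the detrended data and the removed line; by linearity, adding the line's complex bins back restores the full spectrum where needed*)
n = Length[x];
line = Table[x[[1]] + (x[[n]] - x[[1]]) (i - 1)/(n - 1), {i, n}];
specData = Fourier[Most[x - line]];
specLine = Fourier[Most[line]];
corrected = specData + specLine;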
@Jeff Patterson at 7:40 am
My problem is that the detrend line between endpoints has little to do with any real trend in the data. It is an accident of the sample period. Whether there is a real trend in the data or not is something that cannot be determined uniquely. Detrending changes the wavelengths of sub-cycle components.
Now, if you want to say that your method searches for a maximum match to the signal using one fundamental frequency (3 parameters) on one linear trend (2 parameters), then I agree, a global search will find a solution to a signal with at least 5 sample points: 5 equations, 5 unknowns. It doesn’t mean it is the right solution, but it will find one. It presupposes a great deal about the solution.
What you describe seems to be similar to the Periodicity Transform described by Willis in “Cycles Without the Mania.” Certainly we should not let a tool constrain our thinking about the correct components of the solution.
Stephen Rasey says:
My problem is that the detrend line between end points has little to do with any real trend in the data.
Exactly right, but irrelevant. It is the deleterious effect of the trend on the DFT that we are trying to eliminate, so that we can measure the spectral energy in the lower bins that would otherwise be masked by the record-edge discontinuity the trend gives rise to. Periodization exploits the fact that the Fourier transform is a linear operation to decompose the data into two DFT-friendly components.
“Detrending changes the wavelengths of sub-cycle components”
True, but the effect is tractable and can be undone on the back end.
“What you describe seems to be similar to the Periodicity Transform”
No, it is a front-end technique that can be used with any form of spectral analysis. Think of SSA, where the N-length data is decomposed into k N-length components via eigensystem analysis of the data’s covariance matrix. Clearly I can subtract any N-point line (or any other N-point function, for that matter) from the data and simply add it back in, point for point, to the lowest-order SSA reconstruction. I haven’t changed the result one iota, so what have I gained? Simply this: by periodizing the record prior to SSA, I get a much better spectral view of the reconstruction’s signal-to-noise ratio, which can be used to optimize the choice of the window parameter and the reconstruction eigenset.
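To make that concrete, a minimal SSA sketch (leading mode only; data, line and windowL are hypothetical names):

(*Eigen-analysis of the trajectory matrix, diagonal averaging to reconstruct the leading mode, then the subtracted line added back point for point*)
diagAvg[m_] := With[{r = Length[m], c = Length[First[m]]},
  Table[Mean[Table[m[[i, s - i + 1]], {i, Max[1, s - c + 1], Min[r, s]}]], {s, 1, r + c - 1}]];
ssaMode1[x_List, L_] := Module[{X, v},
  X = Partition[x, L, 1];
  v = First[Eigenvectors[Transpose[X].X, 1]];
  diagAvg[Outer[Times, X.v, v]]];
reconstruction = ssaMode1[data - line, windowL] + line;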
Periodization is not quite that transparent with a DFT due to the Gibbs effect, but these distortions are at high frequencies. In this case, periodization is a tool that allows us to trade better low-frequency dynamic range for reduced high-frequency DR.
“Certainly we should not let a tool constrain our thinking about the correct components of the solution.”
Amen brother.
Stephen Rasey says:
September 22, 2013 at 11:22 pm
What you describe seems to be similar to the Periodicity Transform described by Willis in “Cycles Without the Mania”
Willis and I are in complete agreement with regard to Scafetta; given enough cycles I can reconstruct any curve (after all, that is exactly what a cosine transform does). What struck me as surprising in my result was that so few cycles could give such a good fit, that the three main periodic signals were all harmonically related, and the phase stability of the 60-year cycle, which still seems pretty astonishing – here it is again, this time in the SST dataset.
The SSA analyses, however, have tilted me toward Willis’ hypothesis that these are some sort of emergent, long-lived phenomena and not celestial forcing. The period of the mode in the SST dataset (64 years) is far enough removed from that found above in Hadcrut3 (58 years) that the difference is unlikely to be explained by uncertainty.
It turns out the periods are the same after all. See http://montpeliermonologs.wordpress.com/2013/09/21/periodicity-in-the-sst-record/ Now I’m back to leaning towards a celestial source. The analysis indicates a 4.2-year lag between the sea surface temperature and the global surface temperature. Does anyone know if this is in line with expectations?
@Jeff Patterson 7:58am
Your 4.2-year lag comes from a cross-correlation plot with a broad maximum, where any value from 2 to 6 is above 0.90. But I think you are at least one step removed from the data, so I would not get hung up on the fractional-year part of the question. It is not clear from the paragraph above, but your link indicates that global surface temp lags sea surface temp by 4 +/- 4 years.
Thanks. The time series clearly shows a persistent lag measurable in years, not months. My question is: does that make any physical sense?
@Jeff Patterson 10:50 am
does that make any physical sense?
And that’s a question the Ocean Heat Content folks are not asking very loudly. Otherwise, they wouldn’t get so worked up over 0.04 deg C of deep water warming.
A two-year lag via Arctic Ocean ice cap minimums is easier to believe than a six-year lag. A ten-year half-life of an exponential decay function with zero lag is easy to accept. But I find it difficult to visualize a four-year impulse lag.
Maybe an x-year lag is the wrong question. Perhaps it is a phase lag, in much the same way that the damping force of a pendulum seems out of phase with the pendulum’s position but in phase with its velocity. Temp-dot (dTemp/dt) is the forcing, which can be governed by cloud formation. If we look at cloud formation over time, do we see an x-year lag between ocean and land?
Jeff Patterson:
Your latest calculations and speculations don’t begin to address the objections I raised earlier. Instead of performing SSA on the three 54-year segments of Hadcrut3, which would show the ABSENCE of seamlessly spliceable “trends” and “harmonic periodicities,” you arbitrarily continue to use the same stretch of that manufactured index, varying only the number of modes. You then point to a quaint “cross-spectral” comparison between a low-order mode of the SST component and the entire index, as if the hand-waving agreement resolves all objections. And finally you claim that ergodicity is a property of noise, not of signal.
All of this merely makes apparent your inexperience in analyzing geophysical signals, which tend to be Gaussian processes to first order. You might acquaint yourself with the work of S. O. Rice of Bell Labs to gain analytic comprehension of such signal processes. FYI, homogeneous random wave fields are indeed ergodic. And the cross-spectral density as defined by Wiener-Khinchine is complex-valued, consisting of the coherence and phase as well as the pair of power densities, rather than simply the magnitude, which you estimate without any concept of the degrees of freedom. You can’t rely upon your experience in analyzing man-made signals for guidance here. And I can’t take any more time to explicate well-established fundamentals whose essential grasp is persistently avoided.
1sky1 says:
September 23, 2013 at 5:24 pm
Your latest calculations and speculations don’t begin to address the objections I raised earlier.
That’s because I haven’t tried to address the concerns you raised earlier :>) Patience. I did, however, cut the data in half and got the same result – a 60-year quasi-periodic mode which, when removed, yields a constant 0.05 deg/decade regression slope in the residual.
“You then point to a quaint “cross-spectral” comparison between the a low-order mode of SST component and the entire index, ”
Not so. I calculated the cross-correlation between the low-order reconstructions of the Hadcrut3 and SSTNH data, the standard way to look for similarity between two signals.
I think we have a confusion of terms here. The power spectral density of the cross-correlation function is a measure of coherence, so I’m not sure of the point you are trying to make. I drew no conclusions (because I haven’t done a cross-spectral analysis) except to note that the resulting line width shows no sign of cross-modulation, indicating a straight lag relationship as opposed to a multiplicative one.
“Homogeneous random wave fields are indeed ergodic.” – Which is why they can safely be assumed not to carry any information of interest. This is an exercise in small-signal detection by process of elimination. We’re looking for an almost linearly increasing AGW signal that has supposedly been a persistent contribution to temperature for over fifty years. If we remove those ergodic modes that produce a non-persistent contribution, and the oscillatory modes which are clearly not anthropogenic, we’re left with those components that must contain the signal of interest, if it is to be found. I don’t see the flaw in this method. You evidently do, but I can’t parse it from your remarks.
Thanks for taking the time and for the references.
Jeff Patterson:
The issue is not whether there is a ~60-yr “quasi-periodic” component to be found in global or hemispheric SST temperature indices. The issue is that such narrow-band oscillations found in nature are essentially NON-deterministic. They cannot be realistically modeled by any finite set of sinusoids, irrespective of how well such a set may match a short stretch of record. Nor do they constitute “noise” in any physical sense of the word; they are an integral part of the temperature signal. They carry no less information than any longer-term components that are mistakenly taken for “non-ergodic” trends!
I will leave it to you to discover what the (not-only-record-length-imposed) limitations of SSA are. Such decompositions are incapable of separating signal from noise. As for your notions of what the cross-spectrum is analytically and what it reveals about the relationship between two signals, they are not only quaint but also entirely misguided. You are the first analyst I’ve encountered in many decades of experience who believes that cross-spectrum analyzing SSA modes is “standard practice.”
In the forecast section you state:
“If, for example, the harmonics are being generated by a stable non-linear climatic response to some celestial cycle, we would expect the model to have skill in forecasting future climate trends. On the other hand, if the periodicities are internally generated by the climate itself (e.g. feedback involving transport delays), we would expect both the fundamental frequency and, importantly, the phase of the harmonics to evolve with time, making accurate forecasts impossible.”
Followed by:
“We assume the periodicity is externally forced and the climate response remains constant. ”
Is the available data back to 1850 sufficient to identify whether or not the fundamental frequency and phase of the harmonics have evolved with time? I.e., is it possible to rule in or rule out internal climate factors as dominant? This would not much impact judgments wrt external forcings, but it would be an indicative result, and it would lend additional credence to the applied assumption that the periodicity is externally forced.
Duane Oldsen says:
September 25, 2013 at 3:00 am
Is the available data back to 1850 sufficient to identify whether or not the fundamental frequency and phase of the harmonics have evolved w/time?
No, that’s why the forecast should be taken with a grain of salt. It is only as good as the assumptions stated.
Best
Jeff