Riding a Pseudocycle

Guest Post by Willis Eschenbach

Loehle and Scafetta recently posted a piece on decomposing the HadCRUT3 temperature record into a couple of component cycles plus a trend. I disagreed with their analysis on a variety of grounds. In the process, I was reminded of work I had done a few years ago using what is called “Periodicity Analysis” (PDF).

A couple of centuries ago, a gentleman named Fourier showed that any signal could be uniquely decomposed into a number of sine waves with different periods. Fourier analysis has been a mainstay analytical tool since that time. It allows us to detect any underlying regular sinusoidal cycles in a chaotic signal.

Figure 1. Joseph Fourier, looking like the world’s happiest mathematician

While Fourier analysis is very useful, it has a few shortcomings. First, it can only extract sinusoidal signals. Second, although it has good resolution at short timescales, it has poor resolution at the longer timescales. For many kinds of cyclical analysis, I prefer periodicity analysis.

So how does periodicity analysis work? The citation above gives a very technical description of the process, and it’s where I learned how to do periodicity analysis. Let me attempt to give a simpler description, although I recommend the citation for mathematicians.

Periodicity analysis breaks down a signal into cycles, but not sinusoidal cycles. It does so by directly averaging the data itself, so that it shows the actual cycles rather than theoretical cycles.

For example, suppose that we want to find the actual cycle of length two in a given dataset. We can do it by numbering the data points in order, and then dividing them into odd- and even-numbered data points. If we average all of the odd data points, and we average all of the even data points, that gives us the average cycle of length two in the data. Here is what we get when we apply that procedure to the HadCRUT3 dataset:

Figure 2. Periodicity in the HadCRUT3 global surface temperature dataset, with a cycle length of 2. The cycle has been extended to be as long as the original dataset.

As you might imagine for a cycle of length 2, it is a simple zigzag. The amplitude is quite small, only plus or minus a hundredth of a degree. So we can conclude that there is only a tiny cycle of length two in the HadCRUT3 data.
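In R, that length-two averaging takes only a few lines. This is just a sketch of the idea (it assumes the annual HadCRUT3 anomalies are in a vector I’ll call hadcrut3, and it uses the detrend helper from the appendix below); the full “periods” function in the appendix does the general calculation:

x = detrend(hadcrut3)               # assumed vector of annual HadCRUT3 anomalies, detrended
phase = (seq_along(x)-1) %% 2       # 0 for the odd-numbered points, 1 for the even-numbered
cycle2 = tapply(x, phase, mean)     # the two averages: all the odds, all the evens
rep(cycle2, length.out=length(x))   # repeat the two-point cycle out to the length of the data (the calculation behind Figure 2)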

Next, here is the same analysis, but with a cycle length of four. To do the analysis, we number the dataset in order with a cycle of four, i.e. “1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4 …”

Then we average all the “ones” together, and all of the twos and the threes and the fours. When we plot these out, we see the following pattern:

Figure 3. Periodicity in the HadCRUT3 global surface temperature dataset, with a cycle length of 4. The cycle has been extended to be as long as the original dataset.

As I mentioned above, we are not reducing the dataset to sinusoidal (sine wave shaped) cycles. Instead, we are determining the actual cycles in the dataset. This becomes more evident when we look at, say, the twenty-year cycle:

Figure 4. Periodicity in the HadCRUT3 dataset, with a cycle length of 20. The cycle has been extended to be as long as the original dataset.

Note that the actual 20 year cycle is not sinusoidal. Instead, it rises quite sharply, and then decays slowly.
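All three of the figures above come from the same recipe, just with a different cycle length. A small helper along these lines captures the idea (my own sketch for illustration; the appendix code below builds every cycle length at once instead):

cycle_of_length = function(x, n) {    # average cycle of length n, repeated to the length of x
  phase = (seq_along(x)-1) %% n       # label the points 0, 1, ..., n-1, 0, 1, ... in order
  rep(tapply(x, phase, mean), length.out=length(x))
}
# e.g. cycle_of_length(detrend(hadcrut3), 20) is the calculation behind Figure 4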

Now, as you can see from the three examples above, the amplitudes of the various length cycles are quite different. If we set the mean (average) of the original data to zero, we can measure the power in an underlying cyclical signal as the sum of the absolute values of the signal data. It is useful to compare this power value to the total power in the original signal. If we do this at every possible cycle length, we get a graph of the strength of each of the underlying cycles.
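In code, that comparison is one line per cycle length (again only a sketch, using the cycle_of_length helper above; the “periods” function in the appendix does the same thing for every length at once):

x = detrend(hadcrut3)                                     # zero-mean, detrended data
total_power = sum(abs(x))                                 # power of the original signal
power_frac = function(n) sum(abs(cycle_of_length(x, n))) / total_power
sapply(2:(length(x) %/% 2), power_frac)                   # fraction of power at each cycle length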

For example, suppose we are looking at a simple sine wave with a period of 24 years. Figure 5 shows the sine wave, along with periodicity analysis in blue showing the power in each of the various length cycles:

Figure 5. A sine wave, along with the periodicity analysis of all cycles up to half the length of the dataset.

Looking at Figure 5, we can see one clear difference between Fourier analysis and periodicity analysis — the periodicity analysis shows peaks at 24, 48, and 72 years, while a Fourier analysis of the same data would only show the 24-year cycle. Of course, the apparent 48 and 72 year peaks are merely a result of the 24 year cycle. Note also that the shortest length peak (24 years) is sharper than the longest length (72-year) peak. This is because there are fewer data points to measure and average when we are dealing with longer time spans, so the sharp peaks tend to broaden with increasing cycle length.
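You can verify that the 48- and 72-year peaks are just echoes of the 24-year cycle: averaging a pure 24-year sine wave into 48 (or 72) bins reproduces the wave exactly, so those longer “cycles” carry the same power. A quick check, using the dsin helper from the appendix and the cycle_of_length sketch above:

s = dsin((0:157)*15)                            # 24-year sine wave, 158 annual points
all.equal(unname(cycle_of_length(s, 48)), s)    # TRUE: the 48-bin average cycle is the wave itself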

To move to a more interesting example relevant to the Loehle/Scafetta paper, consider the barycentric cycle of the sun. The sun moves around the center of mass of the solar system. As it does so, it speeds up and slows down because of the varying pull of the planets. What are the underlying cycles?

We can use periodicity analysis to find the cycles that have the most effect on the barycentric velocity. Figure 6 shows the process, step by step:

Figure 6. Periodicity analysis of the annual barycentric velocity data. 

The top row shows the barycentric data on the left, along with the amount of power in cycles of various lengths on the right in blue. The periodicity diagram at the top right shows that the overwhelming majority of the power in the barycentric data comes from a ~20 year cycle. It also demonstrates what we saw above, the spreading of the peaks of the signal at longer time periods because of the decreasing amount of data.

The second row left panel shows the signal that is left once we subtract out the 20-year cycle from the barycentric data. The periodicity diagram on the second row right shows that after we remove the 20-year cycle, the maximum amount of power is in the 83 year cycle. So as before, we remove that 83-year cycle.

Once that is done, the third row right panel shows that there is a clear 19-year cycle (visible as peaks at 19, 38, 57, and 76 years). This cycle may be a result of the fact that the “20-year cycle” is actually slightly less than 20 years. When that 19-year cycle is removed, there is a 13-year cycle visible at 13, 26, 39 years and so on. And once that 13-year cycle is removed … well, there’s not much left at all.

The bottom left panel shows the original barycentric data in black, and the reconstruction made by adding just these four cycles of different lengths is shown in blue. As you can see, these four cycles are sufficient to reconstruct the barycentric data quite closely. This shows that we’ve done a valid deconstruction of the original data.
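For anyone following along in R, the peel-off procedure behind Figure 6 can be sketched with the “periods” function from the appendix. This is an illustration of the logic rather than the exact script used for the figure, and it assumes the annual barycentric velocities are in a vector I’ll call bary:

residual = detrend(bary)                 # assumed vector of annual barycentric velocity
reconstruction = 0
for (i in 1:4) {
  p = periods(residual, detrendit=FALSE, doplot=FALSE)
  best = which.max(p$vals[-1]) + 1       # strongest cycle length (skip the trivial length-1 "cycle")
  cyc = p$means[best, ]                  # that cycle, repeated out to the full record length
  reconstruction = reconstruction + cyc  # running sum: the blue curve in the bottom left panel
  residual = residual - cyc              # remove the cycle and look again
}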

Now, what does all of this have to do with the Loehle/Scafetta paper? Well, two things. First, in the discussion on that thread I had said that I thought that the 60 year cycle that Loehle/Scafetta said was in the barycentric data was very weak. As the analysis above shows, the barycentric data does not have any kind of strong 60-year underlying cycle. Loehle/Scafetta claimed that there were ~ 20-year and ~ 60-year cycles in both the solar barycentric data and the surface temperature data. I find no such 60-year cycle in the barycentric data.

However, that’s not what I set out to investigate. I started all of this because I thought that the analysis of random red-noise datasets might show spurious cycles. So I made up some random red-noise datasets the same length as the HadCRUT3 annual temperature records (158 years), and I checked to see if they contained what look like cycles.

A “red-noise” dataset is one which is “auto-correlated”. In a temperature dataset, auto-correlated means that today’s temperature depends in part on yesterday’s temperature. One kind of red-noise data is created by what are called “ARMA” processes. “AR” stands for “auto-regressive”, and “MA” stands for “moving average”. This kind of random noise is very similar to observational datasets such as the HadCRUT3 dataset.
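In R, the ARMA coefficients can be estimated directly from the data and fed to the built-in simulator. Here is a sketch (it assumes the annual HadCRUT3 anomalies are in a vector I’ll call hadcrut3; the appendix below hard-codes the values I calculated, ar = 0.9673 and ma = -0.4591):

fit = arima(detrend(hadcrut3), order=c(1,0,1))              # fit an ARMA(1,1) model to the data
fake = arima.sim(list(ar=coef(fit)["ar1"], ma=coef(fit)["ma1"]), n=158)
fake = (fake - mean(fake)) * sd(hadcrut3) / sd(fake)        # match the observed standard deviation
periods(as.vector(fake))                                    # look for apparent "cycles" in pure red noise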

So, I made up a couple dozen random ARMA “pseudo-temperature” datasets using the AR and MA values calculated from the HadCRUT3 dataset, and I ran a periodicity analysis on each of the pseudo-temperature datasets to see what kinds of cycles they contained. Figure 7 shows eight of the two dozen random pseudo-temperature datasets in black, along with the corresponding periodicity analysis of the power in various cycles in blue to the right of each dataset:

Figure 7. Pseudo-temperature datasets (black lines) and their associated periodicity (blue circles). All pseudo-temperature datasets have been detrended.

Note that all of these pseudo-temperature datasets have some kind of apparent underlying cycles, as shown by the peaks in the periodicity analyses in blue on the right. But because they are purely random data, these are only pseudo-cycles, not real underlying cycles. Despite being clearly visible in the data and in the periodicity analyses, the cycles are an artifact of the auto-correlation of the datasets.

So for example random set 1 shows a strong cycle of about 42 years. Random set 6 shows two strong cycles, of about 38 and 65 years. Random set 17 shows a strong ~ 45-year cycle, and a weaker cycle around 20 years or so. We see this same pattern in all eight of the pseudo-temperature datasets, with random set 20 having cycles at 22 and 44 years, and random set 21 having a 60-year cycle and weak smaller cycles.

That is the main problem with the Loehle/Scafetta paper. While they do in fact find cycles in the HadCRUT3 data, the cycles are neither stronger nor more apparent than the cycles in the random datasets above. In other words, there is no indication at all that the HadCRUT3 dataset has any kind of significant multi-decadal cycles.

How do I know that?

Well, one of the datasets shown in Figure 7 above is actually not a random dataset. It is the HadCRUT3 surface temperature dataset itself … and it is indistinguishable from the truly random datasets in terms of its underlying cycles. All of them have visible cycles, it’s true, in some cases strong cycles … but they don’t mean anything.

w.

APPENDIX:

I did the work in the R computer language. Here’s the code, including the “periods” function, which does the periodicity calculations. I’m not that fluent in R (it’s about the eighth computer language I’ve learned), so it might be kinda klutzy.

#FUNCTIONS

PI=4*atan(1) # value of pi

dsin=function(x) sin(PI*x/180) # sine function for degrees

mod=function(a,b) a %% b # modulo helper used in "periods" below (base R's modulo operator is %%)

regb =function(x) {lm(x~c(1:length(x)))[[1]][[1]]} #gives the intercept of the trend line

regm =function(x) {lm(x~c(1:length(x)))[[1]][[2]]} #gives the slope of the trend line

detrend = function(x){ #detrends a line

x-(regm(x)*c(1:length(x))+regb(x))

}

meanbyrow=function(modline,x){ #returns a full length repetition of the underlying cycle means

rep(tapply(x,modline,mean),length.out=length(x))

}

countbyrow=function(modline,x){ #returns a full length repetition of the underlying cycle number of datapoints N

rep(tapply(x,modline,length),length.out=length(x))

}

sdbyrow=function(modline,x){ #returns a full length repetition of the underlying cycle standard deviations

rep(tapply(x,modline,sd),length.out=length(x))

}

normmatrix=function(x) sum(abs(x)) #returns the norm of the dataset, which is proportional to the power in the signal

# Function “periods” (below) is the main function that calculates the percentage of power in each of the cycles. It takes as input the data being analyzed (inputx). It displays the strength of each cycle. It returns a list of the power of the cycles (vals), along with the means (means), number of datapoints N (count), and standard deviations (sds).

# There’s probably an easier way to do this, I’ve used a brute force method. It’s slow on big datasets

periods=function(inputx,detrendit=TRUE,doplot=TRUE,val_lim=1/2) {

x=inputx

if (detrendit==TRUE) x=detrend(as.vector(inputx))

xlen=length(x)

modmatrix=matrix(NA, xlen,xlen) # placeholder so that row() and col() below have the right dimensions

modmatrix=matrix(mod((col(modmatrix)-1),row(modmatrix)),xlen,xlen) # row i holds the phase labels 0..(i-1) for a cycle of length i

countmatrix=aperm(apply(modmatrix,1,countbyrow,x)) # row i: number of datapoints in each phase of the length-i cycle

meanmatrix=aperm(apply(modmatrix,1,meanbyrow,x)) # row i: the average cycle of length i, repeated to full length

sdmatrix=aperm(apply(modmatrix,1,sdbyrow,x)) # row i: standard deviations for the length-i cycle

xpower=normmatrix(x) # power of the signal being analyzed (sum of absolute values)

powerlist=apply(meanmatrix,1,normmatrix)/xpower # fraction of the total power captured by the average cycle of each length

plotlist=powerlist[1:(length(powerlist)*val_lim)] # only show cycles up to val_lim (default one-half) of the record length

if (doplot) plot(plotlist,ylim=c(0,1),ylab="% of total power",xlab="Cycle Length (yrs)",col="blue")

invisible(list(vals=powerlist,means=meanmatrix,count=countmatrix,sds=sdmatrix))

}

# /////////////////////////// END OF FUNCTIONS

# TEST

# each row in the values returned represents a different period length.

myreturn=periods(c(1,2,1,4,1,2,1,8,1,2,2,4,1,2,1,8,6,5))

myreturn$vals

myreturn$means

myreturn$sds

myreturn$count

#ARIMA pseudotemps

# note that they are standardized to a mean of zero and a standard deviation of 0.2546, which is the standard deviation of the HadCRUT3 dataset.

# each row is a pseudotemperature record

instances=24 # number of records

instlength=158 # length of each record

rand1=matrix(arima.sim(list(order=c(1,0,1), ar=.9673,ma=-.4591),

n=instances*instlength),instlength,instances) #create pseudotemps

pseudotemps =(rand1-mean(rand1))*.2546/sd(rand1)
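
# run the periodicity analysis on each pseudo-temperature record and collect the
# cycle-power curves (one column per record); an added illustration, not the plotting code
cyclepower=apply(pseudotemps,2,function(p) periods(p,doplot=FALSE)$vals)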

# Periodicity analysis of simple sine wave

par(mfrow=c(1,2),mai=c(.8,.8,.2,.2)*.8,mgp=c(2,1,0)) # split window

sintest=dsin((0:157)*15) # 24-year sine wave: 15 degrees per year over 158 annual points

plotx=sintest

plot(detrend(plotx)~c(1850:2007),type="l",ylab="24 year sine wave",xlab="Year")

myperiod=periods(plotx)

Ed Zuiderwijk

Have a look into wavelet analysis.

stephen richards

Good stuff Willis. I did fourier and laplasse functions many years ago. You have explained this particular use as well as anyone I remember.

Katherine

Willis, thanks for making this easy for a layperson to follow. Food for thought, indeed.
Just one thing. You wrote: Note also that the shortest length peak (24 years) is sharper than the longest length (72-year) peak. This is because there are fewer data points to measure and average when we are dealing with longer time spans, so the sharp peaks tend to broaden with increasing cycle length.
But later you wrote: It also demonstrates what we saw above, the spreading of the peaks of the signal at longer time periods because of the decreasing amount of data.
Based on the former, I think the latter should be “It also demonstrates what we saw above, the spreading of the peaks of the signal at longer time periods because of the increasing amount of data.”

Sean Houlihane

Numerology? Why? – your graphs of cycles have ZERO basis to have the original time scale applied under them. Sorry, but it appears that you don’t understand the analysis you are doing here and that doesn’t inspire to try and work out what your point is.

oMan

Very clear and informative. Thanks much. As a layman, I appreciate this mini-tutorial, which shows a powerful analytic tool and a reminder of how easily our eyes and mind find patterns, only some of which are really there.

Willis Eschenbach

Katherine says:
July 30, 2011 at 2:40 am

Willis, thanks for making this easy for a layperson to follow. Food for thought, indeed.
Just one thing. You wrote: Note also that the shortest length peak (24 years) is sharper than the longest length (72-year) peak. This is because there are fewer data points to measure and average when we are dealing with longer time spans, so the sharp peaks tend to broaden with increasing cycle length.
But later you wrote: It also demonstrates what we saw above, the spreading of the peaks of the signal at longer time periods because of the decreasing amount of data.
Based on the former, I think the latter should be “It also demonstrates what we saw above, the spreading of the peaks of the signal at longer time periods because of the increasing amount of data.”

Thanks, Katherine. Actually, the amount of data decreases as the cycle length increases. Consider a cycle of length two. Each one is the average of half the data.
Now consider a cycle length of three. Each one is the average of a third of the data.
As you can see, increasing the cycle length decreases the amount of data available to average for each point in the cycle.
w.

Willis Eschenbach

Sean Houlihane says:
July 30, 2011 at 2:40 am

Numerology? Why? – your graphs of cycles have ZERO basis to have the original time scale applied under them. Sorry, but it appears that you don’t understand the analysis you are doing here and that doesn’t inspire to try and work out what your point is.

Sean, it sounds like you didn’t understand the analysis I am doing here. If you have questions, I’m happy to answer them.
w.

Derek Sorensen

You appear to have used a mathematical algorithm to generate the random data. Is it not possible that there is an element of non-randomness here, which is why you see the periods you do?
I’d be more likely to be convinced if you had used truly random data. Why not grab some datasets here and run the tests again: http://www.random.org/

Willis Eschenbach

Derek Sorensen says:
July 30, 2011 at 3:12 am

You appear to have used a mathematical algorithm to generate the random data. Is it not possible that there is an element of non-randomness here, which is why you see the periods you do?
I’d be more likely to be convinced if you had used truly random data. Why not grab some datasets here and run the tests again: http://www.random.org/

Derek, I have used datasets which are not simple random normal datasets. Those kinds of random datasets are called “white noise”. However, they do not resemble observational datasets of things like temperature.
This is because temperature datasets are “autocorrelated”, meaning that today’s temperature depends in part on yesterday’s temperature. Datasets composed of autocorrelated noise are called “red noise” datasets.
Note that the quality of these datasets is that they don’t take huge jumps, say from way up high to way down low and back again. Instead, they follow something like a “drunkard’s walk”, a random trail that, although not taking large jumps, wanders widely. Because today’s temperature depends in part on yesterday’s temperature, the list of temperatures forms a trail, not just random white noise.
It is entirely possible to generate random “red noise” datasets. They do not, as you suggest, contain the periods. Instead, because they are a trail and they wander up and down, pseudo-cycles are quite common. It looks like cycles … but it’s not.
w.

Willis – do you have anything in your toolbox that would handle cycles of varying length, such as the solar cycle or the PDO?

commieBob

The practical use of Fourier Analysis has a really nasty problem that most people aren’t aware of. It is called spectral leakage and it isn’t dealt with in many/most digital signal processing texts. The fact that the dataset is necessarily truncated produces spurious frequencies and the result is that the analysis can be complete garbage. A time series of fewer than 200 data points (one for each year) is a problem. You will see frequencies that aren’t there and may miss important frequencies because of spectral leakage. A good reference is The Scientist and Engineer’s Guide to Digital Signal Processing By Steven W. Smith, Ph.D. http://www.dspguide.com/pdfbook.htm It is written with the idea that you might actually want to do some digital signal processing and deals with the gotchas that will probably bite you if you don’t know about them.
The fact that Fourier Analysis finds sine waves isn’t a problem. A sine wave is the basic unit of oscillation. It isn’t theoretical, it is really there. Any other repeating waveform can be broken down to its component sine waves. If my spectrometer tells me that a frequency exists, I can tune into that frequency with a radio and find that it does indeed exist.

Stephen Wilde

That seems to be a mathematical way of showing that the cycles we think we see in the historical records and the proxies are not ‘significant’.
But does that matter?
On human timescales the changes that appear to arise from apparent climate cycling are real and in the past have led to the rise and fall of civilisations.
From our perspective there is enough ‘significance’ in those allegedly spurious cycles to give us a degree of predictive ability albeit within a wide range and albeit not guaranteed.
Thus if the sun gets a bit less active the vertical temperature profile of the atmosphere changes with effects on surface pressure distribution and likewise but from the bottom up when ocean surfaces are in heat releasing mode.
So who cares about statistical exercises when the linkages are so apparent?

Richard Saumarez

The peak at 60+ period is almost certainly a function of record length, which appears to be 130 years from your plots. This is a well known problem in any orthogonal, or non-orthogonal, transform and there are well-known ways of eliminating it.
Since the periodic transform uses a non-orthogonal basis set, the physical interpretation of the periodicities is difficult. To decompose a signal into a limited set of non-orthogonal basis functions and then reconstruct it (with some error presumably) tells us nothing about the signal. The PT is closely related to frequency domain template matching, where phase dependence can be specified, which appears to give a better understanding of the structure of the signal.
Your thesis is that a 60 year periodicity in the HADCRUT data is an artefact of the processing methods. I am not convinced that you have done this correctly and I would be interested to see the results of more established methods, applied correctly, to your data set. If there is a 60 year periodicity in the data, it should be possible to establish this, although the statistical significance is likely to be low as it is only twice the fundamental frequency of the data. Since you have appeared to use an ensemble of the HADCRUT data, a stratified approach may yield more information.

Robert of Ottawa

Willis Eschenbach says:
July 30, 2011 at 3:24 am
It looks like cycles … but it’s not.
Which was the whole point of the article, I believe. The human mind looks for patterns where there are not necessarily any – both a gift and curse.

Willis Eschenbach says:
July 30, 2011 at 3:24 am

Fair enough, and thanks for the explanation.

Henry

Willis. You seem to be using “mod”, possibly from the matlab package. You might find “%/%” does integer division.

Kasuha

This is very impressive analysis and explanation, many thanks for it.
I wonder what fits would it make if the periodicity analysis was extended into non-integer domain (i.e. to cover also periods like 19.7 etc)

R. de Haan

Great article.
Looking forward to a response from Loehle and Scafetta here at WUWT.

Nick Stokes

Willis,
I think the problem here is that you are dealing with non-orthogonal functions, and seeking to partition the power between them. In Fourier analysis you can do that, because of orthogonality, but without, you can’t.
The power of the sum of two frequencies, v1 and v2, say, is (v1+v2)^2. And this adds as v1^2 and v2^2 only if the average power in the product v1*v2 is zero – orthogonality.
You might like to check Ed Z’s comment. I think what you are describing is a fast wavelet transform, which works by repeated doubling of the interval length.
What’s in a name? Just that wavelet transforms are very popular, and there is a stack of R packages to help. Eg wavelets.
I note that your IEEE paper anticipates that and points out that their version doubles the period rather than the frequency. But with the fast transform based on doubling, this is just looking at the same sequence in reverse.

Willis, my (limited) understanding of the limitations of average global atmospheric temperature data sets seems that they might not be very representative of real natural processes. I wonder what might happen if you apply the same technique to say the CET data set? I would attempt it myself, however, I suspect it would take me much longer than you… you seem to have the tools at your fingertips and neurons to spare :-).

David

The ‘drunkard’s walk’ is Brownian motion isn’t it? Or a random walk. A stock price can be said to follow such a motion (chartists may disagree – not too many wealthy chartists).

Katherine

As you can see, increasing the cycle length decreases the amount of data available to average for each point in the cycle.
I see. Thanks for the clarification.

When I worked in speech research I used a similar brute force method in analyzing voice waves to locate glottal jitter. Fourier doesn’t work well on these waves because the basic pulse is NOT sinusoidal. Contrary to an earlier comment, the sine wave is not “always there”, it’s always an artifact of Fourier. In a semi-coupled system like the glottis and articulators, Fourier generates a lot of ‘frequencies’ that simply aren’t part of the real process. They only get in the way of a meaningful analysis.
As I recall (25 years later), my algorithm was fairly close to an LPC method, though I didn’t start out trying to implement LPC. Most good speech analysis uses LPC in one way or another.

Jeff L

Willis,
Could you post a picture of the auto correlation of your random data sets & the real data set & use those plots to show the difference between a “real” signal & an artifact? TIA

Steve McIntyre

Willis, I also dislike attempts to find wave patterns in series that are almost certainly red noise. Your reference to Sethares and Staley looks interesting. Since Sethares and Staley refer to wavelet transforms, I presume that (contra Nick Stokes) their methodology has considered wavelet methods, though I am not in a position right now to comment on whether Sethares and Staley’s method is a useful improvement on wavelets or not.

Jeff L

Another thing for readers to understand is that a Fourier transform is a technique to visualize data in a different domain – there is nothing inherently wrong with it, regardless of the nature of the signal that is being transformed. It is a fully reversible process (ie you can transform any time domain dataset into the frequency domain & take any frequency domain dataset back to the time domain without losing any information). The real issue here is how you interpret the results of that transformation / analysis.

Don K

Some comments on various things:
1. As I understand it, any “waveform” including digitized random pencil squiggles has a Fourier transform into the frequency domain that exactly reproduces the waveform. Therefore the fact that cycles appear when the data is subjected to Fourier analysis really doesn’t prove anything. I’m not 100% sure of that, and the underlying math is pretty intimidating to those of us to whom equations do not speak with a loud, clear voice. But I think that one really needs to show physical phenomena with appropriate phase and amplitude to back up Fourier decomposition. The same probably holds true for Periodicity analysis?
2. commieBob on spectral leakage. I was aware that spurious cycles appear due to data set truncation. But I thought they were mostly high frequency cycles and could be removed by low pass filtering or simply ignored in most cases. Am I wrong about that also?
3. CommieBob on whether Fourier’s dependence on sine(/cosine) waves is a problem. Well, yes, you’ll see the components if you do a spectral scan so they are sort of “real”. But it’s not so clear that someone looking for a physical cause for a periodic phenomenon is going to get all that much information from the sine waves if the actual waveform is a square wave or sawtooth or some other arbitrary shape.
4. Derek Sorenson and Willis on randomness. Both right but talking past each other? Willis almost certainly correct that you need to mimic autocorrelation by using red noise. But Derek is probably correct that — at least in concept — the red noise needs to be based on truly random values — white noise — not computer generated pseudo-random numbers. In practice, I think that pseudo-random numbers might be good enough. And all that assumes that red noise is generated by some sort of transformation on an input purportedly random data set. I don’t have the slightest idea how it is actually generated.

Ninderthana

Willis Eschenbach,
A very impressive technique but I must take you up on one point. There is a very strong underlying physical reason for a 60 year cycle in the Barycentre motion.
If you were considering speed and not velocity then I would not dispute your identification of a 20 year cycle but NO 60 year cycle. The speed of the Sun about the centre-of- mass of the Solar System is primarily modulated by the alignments of Jupiter and Saturn every 19.858 years. However, if you are considering the Sun’s velocity (i.e. the Sun’s speed and velocity of the Sun about the centre-of-mass of the Solar System) then this roughly repeats every 59.574 (= 19.858 x 3) years.
This is caused by the fact that the Sun completes one loop around the barycentre roughly once every 20 years. However, the axis of alignment of the orbital loop (about the barycentre) advances by 120 degrees with each completed orbit. This means that the Sun must complete three twenty year orbits for it to return to roughly the same position with respect to the distant stars.
Given that Loehle and Scafetta are claiming that planetary forces are somehow playing an indirect role in influencing the Earth’s climate, this influence would have to be synchronized with the forces involved. This means that they would have to search for both the 20 year periodicity related to the Sun’s barycentric speed, and the 60 year periodicity associated with the Sun’s barycentric velocity.
This shows that it is important to consider the underlying physical principles when you are using periodicity analysis otherwise you might come to the wrong conclusion.
I agree that Loehle and Scafetta should have given the statistical significance of the periodicities that they found in the HADCRUT data compared to AR(1) red-noise. They should have used Singular Spectrum Analysis with Monte Carlo trials to test the statistical significance of their results.

A physicist

Anthony,
As a professional scientist I have to correct this statement you make:
“First, it can only extract sinusoidal signals. Second, although it has good resolution as short timescales, it has poor resolution at the longer timescales. ”
This statement is, unfortunately, incorrect. Mathematically, Fourier expansion/analysis has been shown to be a complete expansion: any function can be expanded in an infinite series of sinusoidal functions. If the periodic function of period T is NOT sinusoidal, then this periodicity can be seen in the Fourier expansion coefficients of the terms with periods T, T/2, T/3, T/4, ….. All those terms have a period of T after 1, 2, 3, 4, … repetitions. Resolution wise – it depends on HOW many terms are included in the expansion. You can include as far as T/100 or T/1000 to get better resolution. Of course, computation wise this is costly. Of course, it is DIFFICULT / IMPOSSIBLE to find a periodic signal in the data with period longer than the total duration of the recorded data/signal (if the data is only recorded for 1000 years, there is no way to find a periodic behavior with period longer than 1000 years, or even longer than 500 years).
[You are mistaken about the author of the statement you quoted. ~dbs, mod.]

Hi Willis,
A very interesting presentation, applying an idea that was originally intended for another domain (music; Sethares, the inventor of this technique, is a music theory expert). This kind of ‘idea recycling’ is often very rewarding, especially older ideas that initially failed, maybe due to lack of computation resources, and blossom when reexamined when more powerful research tools become available.
However, I’ll have to pick a few nits with you on this:
“While Fourier analysis is very useful, it has a few shortcomings. First, it can only extract sinusoidal signals. Second, although it has good resolution as short timescales, it has poor resolution at the longer timescales.”
‘First’) This is not a shortcoming, it’s incredibly useful for analysis. Fourier’s theorem makes an amazing, highly non-intuitive claim: any bounded signal (i.e. ‘real world’) can be decomposed into a set of weighted sinusoids. Guaranteed! How do we prove that? Simple, just add the sinusoids together and you reconstruct the original signal assuming the signal was sampled at least twice per period for each time step (Nyquist).
This also proves that there are no other ‘missing’ waveforms in the analysis, because we can reconstruct the original signal _entirely_ from sinusoids. Any other ‘waveforms’ observed in the signal will also be reduced to sinusoids.
It can also be shown that the resulting sets of sinusoids are unique (independent of the ordering of the calculation steps) and preserve the conservation of energy (Parseval’s theorem). I.e. the sum of the energy in each sinusoid is equal to the total energy in the time envelope.
‘Second’) Perhaps you misstated this. Frequency resolution improves as the length of the time samples increase. It’s a Heisenberg tradeoff: we become more certain about frequency resolutions as we lose time resolution, and vice versa.
One of the problems with Periodicity Analysis (as Sethares and Staley point out in their original paper) is that the resulting patterns are not unique, depending on how you order the steps of your calculation. This is due to the fact that the underlying subspaces of the decomposition are not orthogonal. Whereas in Fourier (and Wavelet) decomposition the subspaces are orthogonal, and so the transforms produce unique pattern features.
So you can’t reconstruct the original signal by simply adding the spectral components, at least not without a lot of complicated bookkeeping to account for interactions between components.
I’m not saying this kind of analysis is useless, but you have to be careful about making claims of ‘reality’ concerning the output products, because they might disappear if you perform the transform in a different order. (Fourier outputs are real in the sense of uniqueness and energy conservation. If it says there is a spectral component at 60, then it’s there, up to Nyquist aliasing and noise issues.)
So, if you’re careful about how the components are produced, then this kind of analysis might be useful for classification and discriminative modeling purposes, and might provide insights into data that other techniques don’t provide.
Can you provide a pointer to the HADCRUT3 datasets you used above. I’d like to do my spectral analysis on this data.
Thank again for a thought-provoking article!

Willis
Interesting analysis for “cycles” – however “nature” may have regular “oscillations” that are not exact “cycles”. e.g. the Pacific Decadal Oscillation (PDO) has been tracked across many centuries, but may not have exactly the same length. There are multiple lines of supporting data for the ~ 20 and 60 year oscillations that provide further support for Loehle & Scafetta. e.g. see Easterbrook
Stockwell at Niche Modeling lists some of PDO analysis papers: Natural Variation – 60 year cycle e.g.

The reconstruction indicates that the PDO is a robust feature of North Pacific climate variability throughout the study period, however, the major modes of oscillation providing the basic PDO regime timescale have not been persistent over the last 530 years. The quasi-centennial (75–115-yr) and pentadecadal (50–70-yr) oscillations dominated the periods before and after 1850, respectively. Our analysis suggest that solar forcing fluctuation on quasi-centennial time scale (Gleissberg cycle) could be the pace-maker of the PDO before 1850, and the PDO behavior after 1850 could be due, in part, to the global warming.

e.g. Shen et al. A Pacific Decadal Oscillation record since 1470 AD reconstructed from proxy data of summer rainfall over eastern China, GEOPHYSICAL RESEARCH LETTERS, VOL. 33, L03702, 4 PP., 2006 doi:10.1029/2005GL024804
Stocker & Mysak “Climatic Fluctuations on the Century Time Scale: A Review of High-Resolution Proxy Data and Possible Mechanisms.” Climatic Change 20:227-250, (1992) 227-250.

The results indicate that this oceanic circulation exhibits natural variability on the century time scale which produces oscillations in the ocean-to-atmosphere heat flux. Although global in extent, these fluctuations are largest in the Atlantic Ocean.

Ed Fix’s Solar activity simulation model seems to capture well the major solar variations which have varying amplitude and length. These could well be the primary cause for the ocean oscillations – even if they are not precise “cycles”. (Shown by David Archibald) See
Ed Fix, “The Relationship of Sunspot Cycles to Gravitational Stresses on the Sun: Results of a Proof-of-Concept Simulation”. Ch 14 p 335 of Dr. Donald Easterbrook, ed. (Elsevier, 2011) e-book (Search the book for “355” or “barycenter” or “sunspot cycles”)

See also the major variations in the Length of Day (LOD), including reviews by Paul L. Vaughan

chrism

good to see,
did you ever see Shaw and Tigg, Applied Mathematica ? It has some absolutely neat Time series evaluation and a maximum entropy data reconstruction,
I would imagine either of the authors would be interested in lending advice from what I remember of their writing,
anyone know them ??

In calculating a global average temperature, the strongest known physical cycles (diurnal and seasonal at each site) are “averaged out”. I would expect that red noise becomes a stronger signal in the process. I suggest that because radiative energy transfer is “line of sight and fast as light”, the relationship between atmospheric CO2 concentrations and energy lost to space is best determined on a site by site and hourly time frame.

Slabadang

Willis!
You are a fantastic pedagogue. And the knowledge and application of statistics in science in general, and in climate science especially, is weak or bad. The over-interpretations are more a rule than an exception.
Often when I hear that they found “signals” as well now as “cycles”, the red flag hoists!

Thank you Willis for 2 things: first, for writing these things in a manner that those of us without a “tidy mind” can understand, and second, for keeping up the search for the truth in “climate science”. You are helping the common man understand that there is an unethical agenda in “climate science”, and as with all things, the truth will out.

Michael Larkin

“The bottom left panel shows the original barycentric data in black, and the reconstruction made by adding just these four cycles of different lengths is shown in blue. As you can see, these four cycles are sufficient to reconstruct the barycentric data quite closely. This shows that we’ve done a valid deconstruction of the original data.”
Willis, at the magnification of that bottom left curve, it’s hard to make out the fact that there might be a black curve and a blue curve. I kept staring at it until I noticed what I assumed were tiny gaps. The black and the blue are so near in tone, and it might have worked better with black and red, for example.
Maths has always been my Achilles’ heel. Whilst I get the gist of what you are saying, I had trouble with the following, from which I didn’t recover:
“For example, suppose that we want to find the actual cycle of length two in a given dataset. We can do it by numbering the data points in order, and then dividing them into odd- and even-numbered data points. If we average all of the odd data points, and we average all of the even data, it will give us the average cycle of length two in the data. Here is what we get when we apply that procedure to the HadCRUT3 dataset:”
You speak of data points, but you do not speak of the values of data points. I can see an amplitude arises of about 0.01, and that seems to be dimensionless; it’s not degrees centigrade, for example. Because you don’t give a worked numeric example, I’m struggling to understand how that amplitude arises and what it means, here and in the elaborated examples for periods 4 and 20.
What I’m failing to grasp may be very simple for you, so simple that you didn’t think to spell it out. If I could grasp it, I think pretty much anyone could.

Michael Larkin

I suppose I should have said an amplitude of 0.02 peak-to-peak.

Robert of Ottawa wrote:
The human mind looks for patterns where there are not necessarily any – both a gift and curse.
Great observation. And people that don’t understand confirmation bias are doomed to continue misinterpreting things due to mental myopia.

charlesH

Assuming my understanding is correct, I think you have it backwards Willis.
The Loehle and Scafetta paper isn’t significant because 60 and 20 yr cycles have been PROVEN to exist in the temperature record, rather they have been shown to replicate the temperature record quite well when a rather small co2 climate sensitivity is added.
This result is consistent with the recent paper of Spencer and Braswell regarding climate sensitivity.
Both papers support the idea that the IPCC overestimates climate sensitivity.
Am I missing something?

Darren Parker

What about the Great Year Cycle of approx 26,000 years? And you could also possibly consider the 70,000 year cycle, the precession of the ecliptic

ChE

AP –
The low-frequency resolution problem that Willis (not Anthony) talks about is no less fundamental than the Heisenberg Uncertainty Principle. In fact, it really is the basis of it.

Leif Svalgaard

Fourier analysis of the distance [in AU] between the sun and the barycenter [inversely related to the speed; doesn’t matter which one is used] http://www.leif.org/research/FFT-Barycenter-Distance.png covering 6000 years [3000 BC to 3000 AD] with a datapoint every 100 days shows the following periods [above 9 years]:
9.92
11.87 Jupiter
12.78 second largest
13.8
19.85 largest peak
29.42 Saturn
35.89
45.31
61.03 very tiny (about 30 times smaller than the 19.85 yr peak)
83.85 Uranus
169.28 Neptune
There is thus no significant 60 yr period as is also evident by simple inspection of any section of the data, e.g. http://www.leif.org/research/Barycenter-Distance-2500BC-2000BC.
The spectral lines are very sharp, because there is no noise and the number of data points is very large ~22,000 and the cycles are actually there and nearly sinusoidal.

ChE

Oh, and Willis, kudos for publishing your code. If only this were standard practice.

Leif Svalgaard says:
July 30, 2011 at 8:13 am
http://www.leif.org/research/Barycenter-Distance-2500BC-2000BC.png is better

Surse

Truly fascinating post and comments. I am not a scientist but in following WUWT for years, I think I’m catching the drift!
Willis: “…what does all of this have to do with the Loehle/Scafetta paper? Well, two things. First, in the discussion on that thread I had said that I thought that the 60 year cycle that Loehle/Scafetta said was in the barycentric data was very weak. As the analysis above shows, the barycentric data does not have any kind of strong 60-year underlying cycle. Loehle/Scafetta claimed that there were ~ 20-year and ~ 60-year cycles in both the solar barycentric data and the surface temperature data. I find no such 60-year cycle in the barycentric data.”
Ninderthana: “A very impressive technique but I must take you up on one point. There is a very strong underlying physical reason for a 60 year cycle in the Barycentre motion.”
“If you were considering speed and not velocity then I would not dispute your identification of a 20 year cycle but NO 60 year cycle. The speed of the Sun about the centre-of- mass of the Solar System is primarily modulated by the alignments of Jupiter and Saturn every 19.858 years. However, if you are considering the Sun’s velocity (i.e. the Sun’s speed and velocity of the Sun about the centre-of-mass of the Solar System) then this roughly repeats every 59.574 (= 19.858 x 3) years.”
“This is caused by the fact that the Sun completes one loop around the barycentre roughly once every every 20 years, However, the axis of alignment of the orbital loop (about the barycentre) advances by 120 degrees with each completed orbit. This means that the Sun must complete three twenty year orbits for it to return to roughly same position with respect to the distant stars.”
Me: So is there a 60 year cycle or not?

Ric Locke

::sigh:: Mathematicians and non-mathematicians talking past one another. Nothing to see here…
First off, there’s nothing in Fourier analysis that prevents it from analyzing odd-shaped waveforms; in fact, that’s the point. A square wave is almost the trivial case — it’s the sum of an infinite number of odd harmonics (“multiples”) of the square wave’s frequency.
Any form of frequency analysis, regardless of method, when applied to a real (limited) data set, will show erroneous frequencies. This is a consequence of the fact that the data set can be thought of as a pulse: (no data)(dataset)(no data). The transition from no data -> data and the one from data -> no data constitute the edges of one cycle of a square wave, so the results will show frequencies at odd harmonics of twice the period of the data. One of the reasons for using wavelet analysis is that it tends to suppress this effect.
Regards,
Ric

Critically challenging published articles here with new analysis, explaining it in a way the layman reader can understand, plus publishing complete code….whaddya think this is, Real Climate?
Oh, wait.

Brian Macker

Derek Sorensen,
“Numerology? Why? – your graphs of cycles have ZERO basis to have the original time scale applied under them.”
If the assumption is that we start at a particular time and that temperature changes behave as red noise from that time with the same magnitude as real temperature changes, then the time frames make sense. Just as placing time frames on other models makes sense. It just so happens that this model is timeless in that you can start with any date you want. It would only need to be fixed if you chose a real temperature from a real date to start the process, or a particular temperature range within a particular time period. Since the claim wasn’t that this is a predictive model, but a non-predictive one, none of that matters.
So apparently it is you who doesn’t understand, and that includes some quite obvious if implicit assumptions.

KR

Excellent analysis, Willis.
I would note that Fourier analysis can uniquely decompose any signal up to the sampling/Nyquist limits. What it does do is simply identify internal patterns in the data that are not sinusoids, as the spectral data of (for example) a sawtooth pattern is spread between multiple sinusoids. However, since the majority of the energy is in those first few low frequency components and harmonics, you can usually identify those major cyclic behaviors.
I’m going to look into the Periodicity Transforms – they might be useful for some of the things I work on. I will echo the concern of others, though, that a non-orthogonal basis function such as that will be operation order dependent, meaning that each implementation of PT will potentially provide different answers even if using the same basis functions.
Robert of Ottawa: “The human mind looks for patterns where there are not necessarily any – both a gift and curse.”
Absolutely right.
It’s worth keeping in mind that cyclic behavior is seen when there is a physical basis behind it, such as a pendulum. But it can also be an artifact of a short data set, which _looks_ cyclic even if it isn’t. For example, the velocity of my car may appear cyclic while trying to get out of a parking space, but that’s a poor prediction of my velocity once I’m on the road. Cyclic analysis not tied to the physical system exhibiting that behavior may well be an artifact of the analysis or the time frame examined – it really lacks explanatory power. And if there’s not a pendulum behind the curtain, predicting future behavior on purely output data analysis will be a serious c***shoot. That’s my major issue with Loehle and Scafetta – they identify nothing in the physics of the climate system that could show such behaviors.