Image Credit: Climate Data Blog
By Richard Linsley Hood – Edited by Just The Facts
The goal of this crowdsourcing thread is to present a 12 month/365 day Cascaded Triple Running Mean (CTRM) filter, explain its basis and value, and gather your input on how I can improve and develop it. A 12 month/365 day CTRM filter is a near Gaussian low pass filter that completely removes the annual ‘cycle’. In fact it is slightly better than Gaussian in this respect: it removes the 12 month ‘cycle’ entirely, whereas a true Gaussian leaves a small residual of it in the data. This new tool is an attempt to treat climate data more accurately and to see what new perspectives, if any, that treatment uncovers. It builds on the good work by Greg Goodman, with Vaughan Pratt’s valuable input, on this thread at Climate Etc.
Before we get too far into this, let me explain some of the terminology that will be used in this article:
—————-
Filter:
“In signal processing, a filter is a device or process that removes from a signal some unwanted component or feature. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often, this means removing some frequencies and not others in order to suppress interfering signals and reduce background noise.” Wikipedia.
Gaussian Filter:
A Gaussian filter is probably the ideal filter in time domain terms. That is, if you think of the graphs here as traces on an oscilloscope, a Gaussian filter is the one that adds the least distortion to the signal.
Full Kernel Filter:
Indicates that the output of the filter will not change when new data is added (except to extend the existing plot). Because each output point is centred on its input window, the output does not extend to the ends of the available data. This is its biggest limitation.
Low Pass Filter:
A low pass filter is one which removes the high frequency components in a signal. One of its most common usages is in anti-aliasing filters for conditioning signals prior to analog-to-digital conversion. Daily, Monthly and Annual averages are low pass filters also.
Cascaded:
A cascade is where you feed the output of the first stage into the input of the next stage, and so on. In a spreadsheet implementation of a CTRM you can produce a single average column in the normal way, then use that column as the input for the next output column, and so on. The value of the inter-stage multiplier/divider is very important: it should be set to 1.2067, the precise value that makes the CTRM a near Gaussian filter. For an annual filter, for example, it gives stage lengths of 12, 10 and 8 months.
Triple Running Mean:
The simplest method to remove high frequencies or smooth data is to use moving averages, also referred to as running means. A running mean filter is the standard ‘average’ that is most commonly used in Climate work. On its own it is a very bad form of filter and produces a lot of arithmetic artefacts. Adding three of those ‘back to back’ in a cascade, however, allows for a much higher quality filter that is also very easy to implement. It just needs two more stages than are normally used.
—————
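As a concrete sketch of the cascade described above, here is a minimal Python version (a hypothetical helper for illustration only; the figures in this post were produced in a spreadsheet and R):

```python
def running_mean(xs, window):
    """Centred running mean; output is shorter than input (full kernel, no padding)."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]

def ctrm(xs, window=12, ratio=1.2067):
    """Cascaded Triple Running Mean: three running means in series, each
    stage's window divided by the inter-stage ratio 1.2067 and rounded."""
    w1 = window                  # 12 months for an annual filter
    w2 = round(w1 / ratio)       # -> 10 months
    w3 = round(w2 / ratio)       # -> 8 months
    out = running_mean(xs, w1)
    out = running_mean(out, w2)
    return running_mean(out, w3)
```

Because every stage is a plain average, the whole cascade is linear, which is why raw and normalised data give the same output apart from a constant offset.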
With all of this in mind, a CTRM filter, run at either 365 days (where data of that resolution is available) or 12 months with the most common data sets, will completely remove the annual cycle while retaining the underlying monthly sampling frequency in the output. Better still, it does not matter whether the data has already been normalised: a CTRM filter will produce the same output on raw or normalised data, apart from a small offset reflecting whatever ‘Normal’ period the data provider chose. The filter adds no distortions of any sort.
Let’s take a look at what this generates in practice. The following are UAH anomalies from 1979 to present with an Annual CTRM applied:
Fig 1: UAH data with an Annual CTRM filter
Note that I have plotted just the data points. The CTRM filter has removed the ‘visual noise’ that month to month variability causes. The result is very similar to the 12 or 13 month single running mean that is often used, but it is more accurate: the mathematical artefacts produced by simple running means are removed, the higher frequencies are removed completely, and all the lower frequencies are left intact.
The following are HadCRUT4 Anomalies from 1850 to Present with an Annual CTRM applied:
Fig 2: HadCRUT4 data with an Annual CTRM filter
Note again that all the higher frequencies have been removed and the lower frequencies are all displayed without distortions or noise.
There is a small issue with these CTRM filters: as mentioned above, they are ‘full kernel’ filters, meaning their outputs will not change when new data is added (except to extend the existing plot). However, because each output point is centred on its input window, the output does not reach the ends of the available data, as can be seen above. Overcoming this issue will require some additional work.
The basic principles of filters work over all timescales, thus we do not need to constrain ourselves to an Annual filter. We are, after all, trying to determine how this complex load that is the Earth reacts to the constantly varying surface input and surface reflection/absorption with very long timescale storage and release systems including phase change, mass transport and the like. If this were some giant mechanical structure slowly vibrating away we would run low pass filters with much longer time constants to see what was down in the sub-harmonics. So let’s do just that for Climate.
When I applied a standard low pass filter sweep to the data I noticed a sweet spot around 12-20 years where the output changes very little. This looks like it may well be a good stop/pass band binary chop point, so I chose 15 years as the roll off point to see what happens. Remember this is a standard low pass/band-pass filter, similar to the one that splits telephone from broadband on an Internet connection. With this approach, all frequencies with periods above 15 years are fully preserved in the output and all frequencies below that point are completely removed.
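The sweep itself can be illustrated on synthetic data (a toy example with assumed components, not the actual HadCRUT sweep): apply the filter at a range of cutoffs and watch how much of the signal survives at each setting.

```python
import math

def running_mean(xs, w):
    return [sum(xs[i:i + w]) / w for i in range(len(xs) - w + 1)]

def ctrm(xs, window, ratio=1.2067):
    """Three cascaded running means, each stage's window shrunk by 1.2067."""
    w = window
    for _ in range(3):
        xs = running_mean(xs, max(1, round(w)))
        w /= ratio
    return xs

# Synthetic monthly series: a slow ~60 year component plus a faster ~3 year one.
series = [math.sin(2 * math.pi * t / 720) + 0.3 * math.sin(2 * math.pi * t / 40)
          for t in range(2400)]

# Sweep the cutoff (in years) and report the peak-to-peak range that survives:
for years in (5, 10, 15, 20, 25):
    out = ctrm(series, years * 12)
    print(years, round(max(out) - min(out), 3))
```

A flat stretch in the printed ranges marks a region where the output changes little with cutoff, which is the kind of sweet spot described above.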
The following are HadCRUT4 anomalies from 1850 to present with a 15 year CTRM and a 75 year single running mean applied:
Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.
Now, when reviewing the plot above, some have claimed that this is a curve fitting or ‘cycle mania’ exercise. However, the data hasn’t been fitted to anything; I just applied a filter. Out pops a wriggle at around ~60 years which the data draws all on its own. It’s the data what done it – not me! If you see any ‘cycle’ in the graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.
Note that the extra ‘greater than 75 years’ single running mean is included to remove the discovered ~60 year line, as one would normally do to get whatever residual is left. Only a single stage running mean can be used as the data available is too short for a full triple cascaded set. The UAH and RSS data series are too short to run a full greater than 15 year triple cascade pass on them, but it is possible to do a greater than 7.5 year which I’ll leave for a future exercise.
And that Full Kernel problem? We can add a Savitzky-Golay filter to the set, which is the Engineering equivalent of LOWESS in Statistics, so should not meet too much resistance from statisticians (want to bet?).
Fig 4: HadCRUT4 with additional S-G projections to observe near term future trends
We can verify that the parameters chosen are correct because the line closely follows the full kernel filter when that is used as a training/verification guide. The latest part of the line should not be considered an absolute guide to the future. Like LOWESS, S-G will ‘whip’ around on new data like a caterpillar searching for a new leaf. However, it tends to follow a similar trajectory, at least until it runs into a tree. While this is only a basic predictive tool, one which assumes the near future will resemble the recent past, it estimates that we are over a local peak and headed downwards…
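For readers who want to see what Savitzky-Golay smoothing actually computes, here is a minimal quadratic-fit version written from the least-squares definition (illustration only; in practice one would use a library routine such as scipy.signal.savgol_filter):

```python
def savgol_quadratic(xs, window):
    """Savitzky-Golay smoothing: least-squares quadratic fit over each
    centred window; the smoothed value is the fit evaluated at the centre.
    window must be odd."""
    assert window % 2 == 1
    h = window // 2
    t = list(range(-h, h + 1))          # centred abscissae
    n = float(window)
    s2 = sum(x * x for x in t)
    s4 = sum(x ** 4 for x in t)
    out = []
    for i in range(h, len(xs) - h):
        seg = xs[i - h:i + h + 1]
        sy = sum(seg)
        sx2y = sum(x * x * y for x, y in zip(t, seg))
        # Centre value of the quadratic fit (odd moments vanish on a symmetric window)
        a = (s4 * sy - s2 * sx2y) / (n * s4 - s2 * s2)
        out.append(a)
    return out
```

Each output point is the centre value of a quadratic fitted to its window; for a window of 5 this reduces to the classic (-3, 12, 17, 12, -3)/35 weights.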
And there we have it: a simple data treatment for the various temperature data sets, a high quality filter that removes the noise and helps us to see the bigger picture. Something to test the various claims made as to how the climate system works. Want to compare it against CO2? Go for it. Want to check SO2? Again, fine. Volcanoes? Be my guest. Here is a spreadsheet containing UAH with an Annual CTRM, and R code for a simple RSS graph. Please just don’t complain if the results from the data don’t meet your expectations. This is just data and summaries of the data. Occam’s Razor for a temperature series. Very simple, but it should be very revealing.
Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed? Do you know how I can make it more accurate, more effective or more accessible? What other data sets do you think might be good candidates for a CTRM filter? Are there any particular combinations of data sets that you would like to see? You may have noted the 15 year CTRM combining UAH, RSS, HadCRUT and GISS at the head of this article. I have been developing various options at my new Climate Data Blog and based upon your input on this thread, I am planning a follow up article that will delve into some combinations of data sets, some of their similarities and some of their differences.
About the Author: Richard Linsley Hood holds an MSc in System Design and has been working as a ‘Practicing Logician’ (aka Computer Geek) to look at signals, images and the modelling of things in general inside computers for over 40 years now. This is his first venture into Climate Science and temperature analysis.





Bernie Hutchins says:
March 18, 2014 at 10:21 am
“I guess I still don’t have much of a clue what you are doing!”
Utilising work done by others to create a high quality filter analysis of the available climate data.
“What you do is really quite (unnecessarily?) complicated, more complicated that SG or Gaussian, etc., but still FIR and you should be able to provide the impulse responses of the individual stages you are cascading. That would be complete, concise, well defined, and unambiguous. Is that available?”
Not really complicated. Just cascading the single stages in an approved fashion to get better responses. The CTRM came from Greg and Vaughan, the S-G from standard use of that technique, matched to the CTRM.
CTRM is just so easy and is so near to Gaussian that the differences hardly matter. Gaussian is the ideal filter to work with on almost all signal work. Usually has too slow a roll-off but in climate no-one is looking for 1/3 octave filters or better!
I am fairly sure that you don’t need the full 5 stage cascade for the S-G. You do need 3 to get to a reasonable match but the extra 2 are probably not really required. I left them in mainly because Nate was SO determined to prove me wrong about the CTRM and they do no real harm – and I do like to credit him with the original as you may see in the R at my blog:-).
The underlying reasoning is to use high quality filters to do the analysis. The particular values came out of seeing what worked and looking at what was there.
The 15 year corner came from a standard low pass sweep up the available bandwidth as you would do in any other signal work and looking for the cut point.
If you have any suggestions of how to improve/simplify the various filters or their responses then I would welcome the advice.
RichardLH says:
March 18, 2014 at 9:38 am
You would think so. But I am told it is all down to co-incidental Volcanoes, SO2 and CO2 in just the right mixture and timing 🙂
Yes, our old friend, the epicycles. Plus ça change…
Bart says:
March 18, 2014 at 10:54 am
“Yes, our old friend, the epicycles. Plus ca change…”
There is a LONG way to go to get to there. It does rather require a physical mechanism with sufficient capability which currently is not there, as far as I am concerned. Lots of suggestions, but nothing that really seems valid in detail.
The response does look very cyclic and that would suggest orbital but HOW is then the problem.
Richard –
It is evident that we are on quite different wavelengths! I do not understand what the limitations are on your resources such that you see an advantage. You seem to conclude that your method is almost as good as Gaussian (converges on Gaussian?). So – Why not use Gaussian?
Indeed, why not use any of the long-available superior alternatives: classic windowing, frequency domain least squares, Parks-McClellan, generalized Frequency Sampling, SG, and probably a half dozen others? Further, these methods already push the performance into the corners (trade-offs) – limited by such things as the uncertainty relationship for Fourier-transform-related signals.
My only suggestions would be to try these standard things and compare them, FIRST. Then try your method and see if it is really better in a sense you can elaborate.
Moreover – assuming a reliable outcome, what have you discovered that is new? As Bart says above, essentially, “We already see the periodicity”. No mathematical method that detects periodicity (filtering, spectral analysis) speaks even indirectly to the origins of that periodicity.
And – you forgot to mention if you have the impulse responses I asked about. If not, just please say so.
Bernie Hutchins says:
March 18, 2014 at 11:07 am
“It is evident that we are on quite different wavelengths! I do not understand what the limitations are on your resources such that you see an advantage. You seem to conclude that your method is almost as good as Gaussian (converges on Gaussian?). So – Why not use Gaussian?”
Because a CTRM is a good enough match not to need that rather more complex routine? The answer will be the same. Differences between the various filter types are not going to be sufficient to alter any of the outcomes in any case (other than in a worse direction).
“Indeed, why not use any of the long-available superior alternatives: classic windowing, frequency domain least squares, Parks-McClellan, generalized Frequency Sampling, SG, and probably a half dozen others? Further, these methods already push the performance into the corners (trade-offs) – limited by such things as the uncertainty relationship for Fourier-transform-related signals. ”
There is no need to get a ‘better’ response than Gaussian and I am not sure that one is available in any case. Please, if you feel that there IS a better alternative, then the data is freely available so run one of your choice and see how it differs. My betting is that the difference will not be enough to notice.
“My only suggestions would be to try these standard things and compare them, FIRST. Then try your method and see if it is really better in a sense you can elaborate. ”
Other than true Gaussian (which I have run), the ones I am currently using seem sufficient to the task. If you feel there is a need to do a complete set of alternatives then go for it. I have enough on my plate trying to get through the various data sets with the two I have right now.
“Moreover – assuming a reliable outcome, what have you discovered that is new? As Bart says above, essentially, “We already see the periodicity”. No mathematical method that detects periodicity (filtering, spectral analysis) speaks even indirectly to the origins of that periodicity. ”
In that case then the ~60 year natural ‘cycle’ is established and all the other stuff needs more detailed looking at. Things such as the various parts of the UAH signal.
http://climatedatablog.files.wordpress.com/2014/02/uah-global.png
Quite intriguing and a source of investigation/work to come.
“And – you forgot to mention if you have the impulse responses I asked about. If not, just please say so.”
No. But as the response curve is so nearly Gaussian I would be incredibly surprised if there are any significant or worthwhile differences between the two of them.
I am always perplexed by the insistence that “we don’t have an identified mechanism”. You don’t need to know how a diesel engine works to know you better get out of the way of the train.
We have a situation of two scientists. Scientist #1 has a hypothesis, but no evidence to back it up. Scientist #2 has no hypothesis, but hard data which demonstrate a phenomenon contradicting #1’s hypothesis. Which to choose, oh, which to choose…
Cyclic phenomena abound in nature. They’re everywhere. Find some piece of metal near you and tap it. Does it not ring? Pour some water in a glass. Does it rise uniformly?
Climate scientists would do better spending their time looking for the mechanism than denying that it is lying in plain sight before them.
…that its signature is lying in plain sight…
Richard –
I sincerely admire your persistence.
But you have nothing.
Just my thanks.
Bernie
Bart says:
March 18, 2014 at 11:39 am
“I am always perplexed by the insistence that “we don’t have an identified mechanism”. You don’t need to know how a diesel engine works to know you better get out of the way of the train.”
But knowing the science that underpins it allows you to run in the right direction!
Bernie Hutchins says:
March 18, 2014 at 11:43 am
“Richard – I sincerely admire your persistence. ”
Thanks.
“But you have nothing.”
Really? A nice way of showing what is actually there with sufficient precision not to be refuted. Nothing? A pity you think so.
“Just my thanks.”
Thank you in return.
Bernie Hutchins:
On March 18, 2014 at 11:07 am, you list a panoply of well-established alternatives to the home-brewed filters touted here and follow up with: “My only suggestions would be to try these standard things and compare them, FIRST. Then try your method and see if it is really better in a sense you can elaborate. Moreover – assuming a reliable outcome, what have you discovered that is new?”
While all your points are spot on the mark and would be acknowledged as such in a professional technical discussion, that is scarcely the purpose or tenor of a dog and pony show intended to establish “street cred” in blogland. Only the most naive, however, will buy the self-promoting notion that Pratt, Goodman or Hood have produced anything new or useful.
1sky1 says:
March 18, 2014 at 3:00 pm
“Only the most naive, however, will buy the self-promoting notion that Pratt, Goodman or Hood have produced anything new or useful.”
Hmmm. That was just establishing the validity of the methods and demonstrating quite clearly that there is something in the ~60 year bracket that HAS happened and needs explaining.
Next (assuming that Anthony will allow) we have a comparison/alignment of the 4 major temperature sets to allow a combined satellite/thermometer set stretching back to the 1800s with the ability to directly compare/contrast the various sources and some possible lines of enquiry from that.
Then, again assuming permission, an analysis of UAH and HadCRUT across the globe, with an area by area breakdown and a direct comparison between them. This raises some interesting questions about possible UHI contamination which will be hard to dismiss.
Want to see?
1sky1 says:
March 18, 2014 at 3:00 pm
P.S. You do know who Vaughan Pratt is, I suppose?
I am led to believe it is this bloke 🙂 http://en.wikipedia.org/wiki/Vaughan_Pratt
Jeff Patterson says:
March 18, 2014 at 6:56 am
I do love all the theory … Greg talks about “differing slopes” and “pure cosines”. I invite him to demonstrate it. He hasn’t replied.
You say my analysis is “bogus”, for the same reason Greg posited … but you don’t give us a worked out example using the actual data either.
I go about it the other way. I do the exercise myself, run the numbers, and then I think about the theory. Depending on how you choose to count (peaks or valleys) using the detrended HadCRUT4 data (detrended as Richard recommended with a 60-year S-G filter) we end up with cycles of 51, 61, 64, and 67 years. That’s not a cycle of any kind, not approximately sixty years, not approximately anything.
So I’m sorry, Jeff, but as much as I generally respect your science-fu, a range from 51 to 67 years is a huge difference, and so I will say it again. There is nothing in the data to indicate that there is any kind of regular cycle in the data. Or as I put it above, “It is a grave mistake, however, to assume or assert that said wiggle has a frequency or a cycle length or a phase.”
NOTE THAT I AM NOT SAYING THAT THERE IS NO REGULAR CYCLE IN THE TEMPERATURE. There may be such a cycle, although to my knowledge no one has ever shown it with any solid evidence. But as always, lack of evidence is not evidence of a lack.
I am saying it is a mistake to ASSUME that there is such a regular cycle and proceed with your analysis on that basis.
w.
Willis Eschenbach says:
March 18, 2014 at 6:25 pm
we end up with cycles of 51, 61, 64, and 67 years. That’s not a cycle of any kind, not approximately sixty years, not approximately anything.
Is it possible that there really is a 60 year cycle out there which just happens to be masked and pushed or pulled in one direction or another due to other cycles? For example, one could plot the times when all ice is gone from Lake Superior and while we know about the 365.24 day cycle when the sun comes around to the same point, we may find that sometimes melt times may be 10 months apart and at other times 14 months apart for example.
I think nobody assumes that the cycles are regular. The cycles are quasiperiodic – oscillations that appear to follow a regular pattern but which do not have a fixed period. Just like the solar cycle, which is ~11 years on average, but actually ~9 – 13 years long.
Willis Eschenbach says:
March 18, 2014 at 6:25 pm
“I go about it the other way. I do the exercise myself, run the numbers, and then I think about the theory. Depending on how you choose to count (peaks or valleys) using the detrended HadCRUT4 data (detrended as Richard recommended with a 60-year S-G filter) we end up with cycles of 51, 61, 64, and 67 years. That’s not a cycle of any kind, not approximately sixty years, not approximately anything.”
I think you may have missed the bit where I did the work you had requested (the 15 year CTRM de-trended with a 75 year S-G) and posted the result above. (This was just a quick exercise posted straight from my clipboard, so apologies for the lack of style.)
http://snag.gy/47xt0.jpg
This shows a very ‘clean’ cycle at either 65 years (if you take the peaks) or 70 years (if you take the zero crossings).
Given that there is only one cycle truly possible to measure I’ll still call that ~60 years (I could make it ~67.5 if you want to be very technical).
I will stand by what that shows and note that future data may allow for a more precise estimate.
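The zero-crossing bookkeeping described here can be sketched as follows (a hypothetical illustration on a synthetic 65 year sinusoid, not the actual detrended HadCRUT series):

```python
import math

def zero_crossing_period(xs, dt=1.0):
    """Estimate a period as the mean spacing between successive
    positive-going zero crossings, scaled by the sample step dt."""
    ups = [i for i in range(1, len(xs)) if xs[i - 1] < 0 <= xs[i]]
    if len(ups) < 2:
        return None
    gaps = [b - a for a, b in zip(ups, ups[1:])]
    return dt * sum(gaps) / len(gaps)

# A synthetic monthly 'wiggle' with a known 65 year period:
series = [math.sin(2 * math.pi * t / (65 * 12)) for t in range(200 * 12)]
estimate = zero_crossing_period(series, dt=1 / 12)  # roughly 65 years
```

Using only upward crossings (positive slope to positive slope) measures full cycles rather than half cycles, which is the point raised later in this thread.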
Edim says:
March 19, 2014 at 2:31 am
“I think nobody assumes that the cycles are regular. The cycles are quasiperiodic – oscillations that appear to follow a regular pattern but which do not have a fixed period. Just like the solar cycle, which is ~11 year in average, but actually ~9 – 13 years long.”
Indeed. I would be VERY surprised if this were a single cycle at all. Given the data, my best guess is that this is most probably a combination of 55, 65 and 75 year cycles mixed in some semi-random combination, if you forced me to decide right now.
That is right out there with just about a thousand other possibilities so should probably be given an appropriate weighting for likely accuracy.
wbrozek says:
March 18, 2014 at 7:43 pm
“For example, one could plot the times when all ice is gone from Lake Superior and while we know about the 365.24 day cycle when the sun comes around to the same point, we may find that sometimes melt times may be 10 months apart and at other times 14 months apart for example.”
If there is indeed an ~60 year cycle in the data then the averages for a 30 year portion would be slightly longer than 365.25 days and the rest slightly shorter. If that cycle were modulated by others then the averages would change with them also. It can get really, really messy and, given the semi random variations that will always exist, it can take a LONG time to be sure/definite about anything.
Willis Eschenbach says:
March 18, 2014 at 6:25 pm
And in case you were wondering how that other plot is derived from the HadCRUT series
http://snag.gy/Sx85m.jpg
As I also said before, I do not expect the underlying 75 year S-G to continue upwards indefinitely. I would expect an inflexion soon based on a likely projection of the current data to also make it into a ‘cycle’.
I do not accept that CO2 is the main driver to that longer trend. It may well be some portion of it however. The ratios are still up for grabs in my world.
Willis Eschenbach says:
March 18, 2014 at 6:25 pm
“You say my analysis is “bogus”, for the same reason Greg posited … but you don’t give us a worked out example using the actual data either.”
I didn’t think an example necessary since clearly a cosine wave (the derivative of a sine wave) oscillating about a non-zero mean (the derivative of the trend slope) has zero-crossings (peaks of the undifferentiated waveform) that are not equally spaced in time. Your method is simply not a valid measure of period. That’s why I said your analysis (not necessarily your conclusion) is bogus.
If you measured, instead of the half cycle periods you used, the peak-to-peak or trough to trough intervals you’d get a better estimate but as I said before, even this approach assumes the trend is constant across the measurement interval which is certainly not the case. For a worked out example that assumes the underlying trend is itself a low frequency sinusoid, see figure 5 of my first WUWT post. The red curve is a three-sine fit to the data (I know you object to the method, I’m just using it as an illustration). Note the lack of consistency in the timing of the peaks/troughs even though it is comprised of purely sinusoidal functions.
Warm regards (and my “science-fu” can’t hold a candle to yours but thanks)
JP
Jeff Patterson says:
March 19, 2014 at 12:55 pm
“If you measured, instead of the half cycle periods you used, the peak-to-peak or trough to trough intervals you’d get a better estimate but as I said before, even this approach assumes the trend is constant across the measurement interval which is certainly not the case”
That may be my fault I’m afraid. I prefer ‘zero crossing’ as a detection point because peaks are much more subject to outliers, etc.
As you can see I prefer to de-trend with a longer residual rather than eyeball though if I can (though I can do that as well if required).
I suspect it is just that Willis is not as good or experienced at doing it as I am. Not his fault really.
As you can see from the work I posted above, I’m pretty good at doing that eyeball stuff as well :-).
You can of course use zero crossings as long as you use positive slope crossing to positive slope crossing (or neg-to-neg). But again, if the trend is non-constant over the measurement interval all bets are off. That’s why God gave us the periodogram :>)
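For completeness, the periodogram mentioned here can be written in a few lines as a naive DFT (a toy illustration; real work would use an FFT-based library routine):

```python
import math

def periodogram(xs):
    """Naive DFT periodogram: power at each Fourier frequency k/N."""
    n = len(xs)
    mean = sum(xs) / n
    xs = [x - mean for x in xs]          # remove the DC term first
    power = []
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(xs))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(xs))
        power.append((k, (re * re + im * im) / n))
    return power

# A sinusoid with period n/k0 shows a single dominant peak at bin k = k0:
n, k0 = 240, 4
series = [math.sin(2 * math.pi * k0 * t / n) for t in range(n)]
peak_k = max(periodogram(series), key=lambda p: p[1])[0]
```

For a pure sinusoid at an exact Fourier frequency the power concentrates in a single bin; a real climate series spreads power across neighbouring bins, which is exactly the quasiperiodicity being argued about above.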
RichardLH, do you have an email so I could drop you a note?