Image Credit: Climate Data Blog
By Richard Linsley Hood – Edited by Just The Facts
The goal of this crowdsourcing thread is to present a 12 month/365 day Cascaded Triple Running Mean (CTRM) filter, inform readers of its basis and value, and gather your input on how I can improve and develop it. A 12 month/365 day CTRM filter completely removes the annual ‘cycle’, as the CTRM is a near Gaussian low pass filter. In fact it is slightly better than Gaussian in that it completely removes the 12 month ‘cycle’, whereas a true Gaussian leaves a small residual of it in the data. This new tool is an attempt to produce a more accurate treatment of climate data and see what new perspectives, if any, it uncovers. This tool builds on the good work by Greg Goodman, with Vaughan Pratt’s valuable input, on this thread on Climate Etc.
Before we get too far into this, let me explain some of the terminology that will be used in this article:
—————-
Filter:
“In signal processing, a filter is a device or process that removes from a signal some unwanted component or feature. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often, this means removing some frequencies and not others in order to suppress interfering signals and reduce background noise.” Wikipedia.
Gaussian Filter:
A Gaussian filter is probably the ideal filter in time domain terms. That is, if you consider that the graphs you are looking at are like those displayed on an oscilloscope, then a Gaussian filter is the one that adds the least distortion to the signal.
Full Kernel Filter:
Indicates that the output of the filter will not change when new data is added (except to extend the existing plot). It does not extend up to the ends of the data available, because the output is in the centre of the input range. This is its biggest limitation.
Low Pass Filter:
A low pass filter is one which removes the high frequency components in a signal. One of its most common uses is in anti-aliasing filters for conditioning signals prior to analog-to-digital conversion. Daily, Monthly and Annual averages are also low pass filters.
Cascaded:
A cascade is where you feed the output of the first stage into the input of the next stage, and so on. In a spreadsheet implementation of a CTRM you can produce a single average column in the normal way and then use that column as the input for the next output column, and so on. The value of the inter-stage multiplier/divider is very important: it should be set to 1.2067, the precise value that makes the CTRM into a near Gaussian filter. For an Annual filter this gives stage lengths of 12, 10 and 8 months, for example. (A short R sketch of the cascade follows this list of terms.)
Triple Running Mean:
The simplest method to remove high frequencies or smooth data is to use moving averages, also referred to as running means. A running mean filter is the standard ‘average’ that is most commonly used in Climate work. On its own it is a very bad form of filter and produces a lot of arithmetic artefacts. Adding three of those ‘back to back’ in a cascade, however, allows for a much higher quality filter that is also very easy to implement. It just needs two more stages than are normally used.
—————
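For those who would like to experiment directly, here is a minimal R sketch of the cascade just described (the spreadsheet and R code linked at the end of the article do much the same job). The helper names and the synthetic test series are mine, for illustration only; the stage lengths of 12, 10 and 8 months come from dividing 12 by 1.2067 twice and rounding to whole months.

# Minimal R sketch of a 12-month CTRM. 'x' is assumed to be a monthly
# anomaly series (a plain numeric vector or a ts object).
running_mean <- function(x, n) {
  # Centred running mean of length n; NA near the ends (full kernel behaviour).
  stats::filter(x, rep(1 / n, n), sides = 2)
}
ctrm <- function(x, stages = c(12, 10, 8)) {
  # Cascade: the output of each running mean feeds the input of the next.
  for (n in stages) x <- running_mean(x, n)
  as.numeric(x)
}
# Quick check on synthetic data: the 12-month cycle vanishes, the trend stays.
# months <- 1:480
# x <- sin(2 * pi * months / 12) + 0.001 * months + rnorm(480, sd = 0.1)
# plot(months, ctrm(x), type = "l")

The three calls simply mirror the three successive average columns of the spreadsheet approach described under ‘Cascaded’ above.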
With all of this in mind, a CTRM filter, used either at 365 days (if we have that resolution of data available) or 12 months in length with the most common data sets, will completely remove the Annual cycle while retaining the underlying monthly sampling frequency in the output. In fact it is even better than that, as it does not matter whether the data used has already been normalised or not. A CTRM filter will produce the same output on either raw or normalised data, apart from a small offset that reflects whatever ‘Normal’ period the data provider chose. There are no added distortions of any sort from the filter.
Let’s take a look at what this generates in practice. The following are UAH Anomalies from 1979 to Present with an Annual CTRM applied:
Fig 1: UAH data with an Annual CTRM filter
Note that I have just plotted the data points. The CTRM filter has removed the ‘visual noise’ that month to month variability causes. This is very similar to the 12 or 13 month single running mean that is often used; however, it is more accurate because the mathematical errors produced by those simple running means are removed. Additionally, the higher frequencies are completely removed while all the lower frequencies are left completely intact.
The following are HadCRUT4 Anomalies from 1850 to Present with an Annual CTRM applied:
Fig 2: HadCRUT4 data with an Annual CTRM filter
Note again that all the higher frequencies have been removed and the lower frequencies are all displayed without distortions or noise.
There is a small issue with these CTRM filters in that CTRMs are ‘full kernel’ filters as mentioned above, meaning their outputs will not change when new data is added (except to extend the existing plot). However, because the output is in the middle of the input data, they do not extend up to the ends of the data available as can be seen above. In order to overcome this issue, some additional work will be required.
The basic principles of filters work over all timescales, thus we do not need to constrain ourselves to an Annual filter. We are, after all, trying to determine how this complex load that is the Earth reacts to the constantly varying surface input and surface reflection/absorption with very long timescale storage and release systems including phase change, mass transport and the like. If this were some giant mechanical structure slowly vibrating away we would run low pass filters with much longer time constants to see what was down in the sub-harmonics. So let’s do just that for Climate.
When I applied a standard time/energy low pass filter sweep against the data I noticed that there is a sweet spot around 12-20 years where the output changes very little. This looks like it may well be a good stop/pass band binary chop point, so I chose 15 years as the roll-off point to see what happens. Remember this is a standard low pass/band-pass filter, similar to the one that splits telephone from broadband to connect to the Internet. Using this approach, all components with periods longer than 15 years are fully preserved in the output and all those with shorter periods are completely removed.
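To illustrate what such a sweep looks like in code, the ctrm() helper sketched earlier can simply be reused with longer stages. This is a rough sketch only; the 180, 149 and 124 month stage lengths are my own rounding of 15 years stepped down by 1.2067 twice, and 'hadcrut' is a stand-in for a monthly HadCRUT4 anomaly vector.

# Sketch of a >15 year low pass, plus a crude cut-off sweep, reusing ctrm().
# 'hadcrut' is assumed to hold a monthly HadCRUT4 anomaly series.
lowpass_15yr <- function(x) ctrm(x, stages = c(180, 149, 124))
# Sweep a range of cut-off periods; between roughly 12 and 20 years the
# filtered output changes very little, which is the 'sweet spot' noted above.
# for (years in c(10, 12, 15, 18, 20)) {
#   n1 <- years * 12
#   stages <- round(c(n1, n1 / 1.2067, n1 / 1.2067^2))
#   lines(ctrm(hadcrut, stages = stages))
# }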
The following are HadCRUT4 Anomalies from 1850 to Present with a 15 year CTRM and a 75 year single running mean applied:
Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.
Now, when reviewing the plot above some have claimed that this is a curve fitting or a ‘cycle mania’ exercise. However, the data hasn’t been fitted to anything; I just applied a filter. Then out pops a wriggle, at around ~60 years, which the data draws all on its own. It’s the data what done it – not me! If you see any ‘cycle’ in the graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.
Note that the extra ‘greater than 75 years’ single running mean is included to remove the discovered ~60 year line, as one would normally do to get at whatever residual is left. Only a single stage running mean can be used here because the data available is too short for a full triple cascaded set. The UAH and RSS data series are too short to run a full greater than 15 year triple cascade pass on them, but it is possible to do a greater than 7.5 year pass, which I’ll leave for a future exercise.
And that Full Kernel problem? We can add a Savitzky-Golay filter to the set, which is the Engineering equivalent of LOWESS in Statistics, so it should not meet too much resistance from statisticians (want to bet?).
Fig 4: HadCRUT4 with additional S-G projections to observe near term future trends
We can verify that the parameters chosen are correct because the line closely follows the full kernel filter when that is used as a training/verification guide. The latest part of the line should not be considered an absolute guide to the future. Like LOWESS, S-G will ‘whip’ around on new data like a caterpillar searching for a new leaf. However, it tends to follow a similar trajectory, at least until it runs into a tree. While this is only a basic predictive tool, which assumes that the future will be like the recent past, it estimates that we are over a local peak and headed downwards…
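For readers who want to experiment, a Savitzky-Golay smooth can be produced in R with the ‘signal’ package’s sgolayfilt(). The polynomial order and window length below are illustrative placeholders rather than the values behind Fig 4; as just described, in practice they are tuned so that the S-G line tracks the full kernel CTRM over the region where both exist.

# Sketch of a Savitzky-Golay smooth via the 'signal' package (placeholder
# parameters, not the ones used for Fig 4).
# install.packages("signal")
library(signal)
sg_smooth <- function(x, p = 2, n = 181) {
  # p: polynomial order; n: window length in samples (must be odd).
  sgolayfilt(x, p = p, n = n)
}
# Unlike the full kernel CTRM, the S-G output runs right to the ends of the
# data, which is what allows the near-term extension shown in Fig 4.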
And there we have it. A simple data treatment for the various temperature data sets: a high quality filter that removes the noise and helps us to see the bigger picture. Something to test the various claims made as to how the climate system works. Want to compare it against CO2? Go for it. Want to check SO2? Again, fine. Volcanoes? Be my guest. Here is a spreadsheet containing UAH and an Annual CTRM, and R code for a simple RSS graph. Please just don’t complain if the results from the data don’t meet your expectations. This is just data and summaries of the data. Occam’s Razor for a temperature series. Very simple, but it should be very revealing.
Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed? Do you know how I can make it more accurate, more effective or more accessible? What other data sets do you think might be good candidates for a CTRM filter? Are there any particular combinations of data sets that you would like to see? You may have noted the 15 year CTRM combining UAH, RSS, HadCRUT and GISS at the head of this article. I have been developing various options at my new Climate Data Blog and based upon your input on this thread, I am planning a follow up article that will delve into some combinations of data sets, some of their similarities and some of their differences.
About the Author: Richard Linsley Hood holds an MSc in System Design and has been working as a ‘Practicing Logician’ (aka Computer Geek) to look at signals, images and the modelling of things in general inside computers for over 40 years now. This is his first venture into Climate Science and temperature analysis.





Greg Goodman: No, filters separate information. What you “throw” away is your choice.
Marler: “Filtering” is nothing more than fitting data by a method that uses a set of basis functions, and then separating the results into two components (as said by Greg Goodman).
Greg Goodman: I said nothing of the sort. Don’t use my name to back up your ignorant claims.
Marler: I forgot to add. I cited Greg Goodman as the source of the idea of “separating information”, not as an authority, because he mentioned it before I did.
Richard: Well, whatever your logic, there is something with some power in the 15 to 75 year bracket that the data says is present and needs explaining.
For a hypothesized signal you can estimate the noise. For a hypothesized noise, you can estimate a signal. Quoting Greg Goodman again: No, filters separate information. What you “throw” away is your choice. To paraphrase, which you call the “signal” and which you call the “noise” depends on other information, such as future data. When doing a decomposition into frequencies, there is no “automatic” or theory-free method to decide whether the “noise” resides in the high or low frequencies. You stated your hypotheses, and got the corresponding estimates. Which frequencies persist into the future will be discovered in the future. That you could get a decomposition to fit your hypotheses was guaranteed ( p = 1.0, so to speak, no matter what the true nature of the finite data set.)
I have a better and simpler forecast:
http://www.woodfortrees.org/plot/hadcrut4gl/from:1970/trend/plot/hadcrut4gl/from:1970/to:2001/trend/plot/hadcrut4gl/from:1970/to/plot/gistemp/from:1970/trend/plot/gistemp/from:1970/to:2001/trend/plot/gistemp/from:1970
Jeff Patterson says:
March 17, 2014 at 6:36 pm
“Assuming the signal is spectrally localizable, it is up to the user to decide what defines the “signal” of interest and to design a filter appropriate to the task. It should be pointed out that the residual (the part of the input signal rejected by the LPF) by definition attenuates long-term trend information and thus all of this information, which is what we are after, is contained in the output. That is, I think, what Greg and Richard mean by “nothing is lost”, although that’s not how I would phrase it.”
Indeed that is precisely it.
You can have a function
lowpass(signal) -> L
you can also either have a function
highpass(signal) -> H
or use
signal – L = H
in either case no information is ‘lost’. It is always in either L or H.
True, with real, practical filters some part of the ‘signal’ could end up in both L and H by mistake, but provided you ensure the corner frequency is chosen to stay away from points of interest, no real problems occur.
Sure, if you drop L or H on the floor and scrub some dirt over it with your foot then stuff is most definitely ‘lost’, but I do try not to do that 🙂
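In R terms this is a one-liner, a minimal sketch assuming x is the original monthly series and ctrm() is the helper sketched in the article:

# Minimal sketch of the complement.
L <- ctrm(x)   # the low pass output (NA where the full kernel is undefined)
H <- x - L     # the high pass residual; nothing is discarded, and L + H
               # reproduces x wherever the filter output is defined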
Bernie Hutchins says:
March 17, 2014 at 6:41 pm
“No one will argue information loss if this is your angle. Because no information IS lost.”
I know – but it doesn’t stop the mantra being rolled out.
ThinkingScientist says:
March 17, 2014 at 6:59 pm
“Just to let you know, I think you have been extraordinarily patient with some of the critics here.”
Thank you
“As a geophysicist myself, with more than a passing understanding of low and high pass filtering (and as an audiophile and user of PA equipment including subwoofer design) I have no problem in understanding the utility of the simple (but clever) filter you have presented or the significance of the low pass/high pass threshold being a non-critical choice over a wide range of frequencies. Makes sense to me.”
Me too. A very useful tool to explore with. In reality it doesn’t much matter if you use a CTRM or a true Gaussian; the results will be nearly indistinguishable. The CTRM is just easier for almost everybody to construct and use. Higher order filters, if you have the knowledge to use them, will produce similar results, but they require more understanding and bring their own complications.
cd says:
March 17, 2014 at 7:05 pm
“Perhaps I’m missing it, but I don’t think he is using a discrete filter in the sense of the sample window.”
A moving average uses a sliding sample window, and this is just an extension of that base concept.
cd says:
March 17, 2014 at 7:13 pm
“Again, without all the gobbledigook (obfuscation) can you reconstruct the original signal from the post-processed series + the “inverse” convolution (what’s that you say, there is no inverse!).
So you can’t. OK then information is lost! And Willis was right!”
Straw man.
Lowpass(signal) -> L
Highpass(signal) -> H
or
signal – L = H
no information is ‘lost’. Ever.
Willis Eschenbach says:
March 17, 2014 at 7:32 pm
“As a result, we can definitely improve the resolution as you point out regarding the Hubble telescope. But (as someone pointed out above) if you are handed a photo which has a gaussian blur, you can never reconstruct the original image exactly.”
Straw man.
No one other than you, certainly not me, has ever claimed such.
I repeat what I said above to cd
Lowpass(signal) -> L
Highpass(signal) -> H
or
signal – L = H
no information is ‘lost’. Ever.
Jeff Patterson says:
March 17, 2014 at 8:57 pm
“Whether the “65 year cycle” is a near line spectrum of astronomical origin, a resonance in the climate dynamics, a strange attractor of an emergent, chaotic phenomenon or a statistical anomaly, what it isn’t is man-made. It is, however, responsible for the global warming scare. If it had not increased the rate of change of temperature during its maximum slope period of the 1980s and 90s, the temperature record would be unremarkable (see e.g. the cycle that changed history) and we wouldn’t be here arguing the banalities of digital filter design.”
+10
Matthew R Marler says:
March 18, 2014 at 12:32 am
“Richard LH: Curve fitting is deciding a function and then applying it to the data.”
should read
Curve fitting is deciding a set of functions and then applying them to the data.
Sorry.
Matthew R Marler says:
March 18, 2014 at 12:52 am
“To paraphrase, which you call the “signal” and which you call the “noise” depends on other information, such as future data. When doing a decomposition into frequencies, there is no “automatic” or theory-free method to decide whether the “noise” resides in the high or low frequencies. ”
OK – definition time (for this article anyway)
Signal = anything longer than 15 years in period
Noise = anything shorter than 15 years in period
Splice says:
March 18, 2014 at 1:26 am
“I have a better and simpler forecast:”
As I have said before on many occasions (and I DO mean it)
Linear trend = Tangent to the curve = Flat Earth
The way you get to the conclusion that the Earth is flat is to concentrate on a local area and then extrapolate that observation to areas outside the range from which the observations are drawn.
Exactly the same is done if you apply a straight line (in a general OLS statistical fashion) to a set of temperature observations and then draw any conclusion about what it means (pun) outside of that range, future or past.
If you use a continuous function, such as a filter, then you are much closer to the ‘truth’ about what is, what was, and what may be about to happen.
RichardLH
no information is ‘lost’.
You’re shifting the goal posts. If all you have is the smoothed data then you’ve lost information. The original point stated this.
While not being an expert I am not a novice either, and what you’ve done here with regard to the above point is arm wave and obfuscate.
Again, I have no problem with your work.
cd says:
March 18, 2014 at 3:54 am
“You’re shifting the goal posts. If all you have is the smoothed data then you’ve lost information. The original point stated this.”
I am not. I have chosen to display only part of the information – that is true. That is not the same thing as losing information. The ‘other part’ is still there, just not displayed. If you like I can do the
signal-LowPass = HighPass
to display the high pass bit but that was not what this article was about.
RichardLH
Willis’s original point.
Any filter which is not uniquely invertible loses information.
Your reply here:
signal-LowPass = HighPass
Is an answer to a different question. Note that what Willis’s statement refers to is:
LowPass
Nothing else! It then follows that HighPass is the lost information and he is correct. As I said, it is a side issue, one that you could have dealt with by a simple admission. It doesn’t even challenge your work, but it is annoying when you deal with a point with obfuscation.
cd says:
March 18, 2014 at 4:41 am
“RichardLH
Willis’s original point.
Any filter which is not uniquely invertible loses information.
Your reply here:
signal-LowPass = HighPass
Is an answer to a different question. Note that what Willis’s statement refers to is:
LowPass
Nothing else! It then follows that HighPass is the lost information and he is correct. As I said, it is a side issue, one that you could have dealt with by a simple admission. It doesn’t even challenge your work, but it is annoying when you deal with a point with obfuscation.”
It is not obfuscation, it is fact.
True, if you were to confine yourself to looking only at the LowPass, then you would have only part of the information, so in that limited sense the other part is ‘lost’. I have never claimed otherwise.
Fortunately we can look outside of that very, very narrow viewpoint and observe that the ‘lost’ information is still there, over in the HighPass bit, as I have always done.
What part of those observations and conclusions do you disagree with?
cd:
Here is a plot of the High Pass for UAH data just so that you can be sure that nothing is ‘lost’
http://snag.gy/RtTA4.jpg
cd: And the distribution of the same
http://snag.gy/CrlUx.jpg
Reposted from
http://climatedatablog.wordpress.com/2014/03/15/r-code-for-simple-rss-graph/#comments
indpndnt says:
March 18, 2014 at 1:29 pm
I have replicated this work in Python, for those who prefer it or might find it useful. It can be found on GitHub at:
https://github.com/indpndnt/ipython/blob/master/ClimateSmoothing.py
—
A useful addition for those who wish to work in Python rather than R.
Thanks to ‘indpndnt’.
Willis Eschenbach says:
March 17, 2014 at 12:22 am
“It is a grave mistake, however, to assume or assert that said wiggle has a frequency or a cycle length or a phase. Let me show you why, using your data:”
Sorry Willis, but your analysis is bogus. The LPF doesn’t remove the trend. For the sake of argument assume the trend is linear and the filter output is simply a*t + B*Sin[wo*t + phi]. The derivative is a Cosine plus a constant and is thus not zero mean. The timing of the zero crossings of the derivative (the points where you draw the vertical lines to show the supposed timing inconsistency) depends on the constant (the slope of the trend) and will not be consistent unless the slope is zero. This effect becomes even more pronounced if there are other sub-harmonics present, because these low frequency signals make the slope of the trend time dependent.
Matthew R Marler says:
March 18, 2014 at 12:52 am
Richard: Well, whatever your logic, there is something with some power in the 15 to 75 year bracket that the data says is present and needs explaining.
Matthew: To paraphrase, which you call the “signal” and which you call the “noise” depends on other information, such as future data. When doing a decomposition into frequencies, there is no “automatic” or theory-free method to decide whether the “noise” resides in the high or low frequencies.
Your first sentence is indecipherable (the signal definition depends on future data??). The second sentence is wrong. It is the AGW hypothesis that defines the signal of interest, namely a long-term temperature trend imposed on the natural variation of the climate, which itself may contain a natural trend. Spectral decomposition, regardless of the technique used, throws away the high frequency data to improve the signal to noise ratio where each is defined as above. The main point is that by definition, the discarded “noise” is detrended and zero-mean and thus cannot contain any information about anthropogenic effects. Likewise the wiggles which remain, regardless of cause, are clearly of natural origin. Remove them as well and the temperature record shows a remarkably consistent ~.5 degC/century slope since at least 1850 with no discernible dependency on CO2 concentration.
“Which frequencies persist into the future will be discovered in the future. That you could get a decomposition to fit your hypotheses was guaranteed ( p = 1.0, so to speak, no matter what the true nature of the finite data set.)”
You still misunderstand the difference between filtering and curve fitting but no matter. Prediction of the climate is a fool’s errand. Since there is no detectable effect of CO2 in the temperature record the motive for forecasting is obviated. The climate will do what it has always done- meander about.
Jeff – there are obviously many motives for forecasting future climate without regard to CO2. If, for example, we have reason to believe that we might expect a significant cooling, then this would suggest various actions in mitigation or adaptation. It is very obvious, simply by eyeballing the last 150 years of temperature data, that there is a 60 year natural quasi-periodicity at work. Sophisticated statistical analysis actually doesn’t add much to eyeballing the time series. The underlying trend can easily be attributed to the 1000 year quasi-periodicity. See Figs 3 and 4 at
http://climatesense-norpag.blogspot.com
The 1000 year period looks pretty good at 10000, 9000, 8000, 7000, 2000, 1000 and 0.
This would look good, I’m sure, on a wavelet analysis, with the peak fading out from 7000 to 3000.
The same link also provides an estimate of the timing and extent of possible future cooling, using the recent peak as a synchronous peak in both the 60 and 1000 year cycles and the neutron count as supporting evidence of a coming cooling trend, as it appears the best proxy for solar “activity”, while remaining agnostic as to the processes involved.
Some suggestions for RichardLH
1) Plot the frequency response in dB. This allows much better dynamic range in observing how far down the sidelobes and passband ripples are.
2) Run randomly generated data through the filter and look at the power spectral density of the output. If the PSD is flat in the passband, the filter is exonerated as the source of the wiggles.
3) Show the step response of the filter, (not the theoretical) by running a step through the actual filter and plotting the result to prove there is no error in the implementing code and to show there is no substantial overshoot and/or ringing. This step is necessary because the temperature data is not WGN and contains large short-term excursions which will ring the filter if it is underdamped.
4) I think I saw somewhere that Greg had a high-pass version of the filter. Set the cut-off of the HP version as high as possible while still passing the 65 year cycle unattenuated. Subtract the HP filter output from the input data after lagging the input data by the (N-1)/2 samples (the filter delay) where N is the number of taps (coefficients) in the filter (N should be odd). The result will be the trend, which I suspect will show a remarkably low variance about a ~.5 degC/century regression.
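In the spirit of suggestion 1, one way to sketch the frequency response of the annual CTRM in dB is to convolve the three rectangular kernels and take the FFT of the zero-padded composite kernel. This is my own construction using the 12/10/8 month stages from the article, offered only as a starting point:

# Sketch of suggestion 1: magnitude response of the 12/10/8 month CTRM in dB.
box <- function(n) rep(1 / n, n)
# Composite kernel = convolution of the three rectangular (running mean) kernels.
kernel <- convolve(convolve(box(12), rev(box(10)), type = "open"),
                   rev(box(8)), type = "open")
nfft <- 4096                                  # zero-pad for a smooth curve
spec <- fft(c(kernel, rep(0, nfft - length(kernel))))
freq <- (0:(nfft / 2)) / nfft                 # frequency in cycles per month
resp_db <- 20 * log10(pmax(Mod(spec[1:(nfft / 2 + 1)]), 1e-12))  # floor avoids -Inf at nulls
# plot(freq, resp_db, type = "l", xlab = "cycles per month", ylab = "response (dB)")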