Crowdsourcing A Full Kernel Cascaded Triple Running Mean Low Pass Filter, No Seriously…

Fig 4-HadCrut4 Monthly Anomalies with CTRM Annual, 15 and 75 years low pass filters

Image Credit: Climate Data Blog

By Richard Linsley Hood  – Edited by Just The Facts

The goal of this crowdsourcing thread is to present a 12 month/365 day Cascaded Triple Running Mean (CTRM) filter, inform readers of its basis and value, and gather your input on how I can improve and develop it. A 12 month/365 day CTRM filter completely removes the annual ‘cycle’, as the CTRM is a near Gaussian low pass filter. In fact it is slightly better than Gaussian in that it completely removes the 12 month ‘cycle’ whereas true Gaussian leaves a small residual of that still in the data. This new tool is an attempt to produce a more accurate treatment of climate data and see what new perspectives, if any, it uncovers. This tool builds on the good work by Greg Goodman, with Vaughan Pratt’s valuable input, on this thread on Climate Etc.

Before we get too far into this, let me explain some of the terminology that will be used in this article:

—————-

Filter:

“In signal processing, a filter is a device or process that removes from a signal some unwanted component or feature. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often, this means removing some frequencies and not others in order to suppress interfering signals and reduce background noise.” Wikipedia.

Gaussian Filter:

A Gaussian filter is probably the ideal filter in time domain terms. That is, if you consider that the graphs you are looking at are like the ones displayed on an oscilloscope, then a Gaussian filter is the one that adds the least distortion to the signal.

Full Kernel Filter:

Indicates that the output of the filter will not change when new data is added (except to extend the existing plot). It does not extend up to the ends of the data available, because the output is in the centre of the input range. This is its biggest limitation.

Low Pass Filter:

A low pass filter is one which removes the high frequency components in a signal. One of its most common usages is in anti-aliasing filters for conditioning signals prior to analog-to-digital conversion. Daily, Monthly and Annual averages are low pass filters also.

Cascaded:

A cascade is where you feed the output of the first stage into the input of the next stage, and so on. In a spreadsheet implementation of a CTRM you can produce a single average column in the normal way and then use that column as the input for the next output column, and so on. The value of the inter-stage divisor is very important: it should be set to 1.2067, the precise value that makes the CTRM a near Gaussian filter. Dividing each stage's window length by it gives 12, 10 and 8 months for the three stages of an Annual filter, for example.
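For readers who prefer code to spreadsheet columns, the cascade above can be sketched as follows (a minimal illustration; the function names are mine, while the 1.2067 ratio and the resulting 12/10/8 month windows are from the text):

```python
import numpy as np

def running_mean(x, n):
    """Centred running mean of window length n (interior points only)."""
    return np.convolve(x, np.ones(n) / n, mode="valid")

def ctrm(x, first_window=12, ratio=1.2067):
    """Cascaded Triple Running Mean: each stage's window is the previous
    one divided by 1.2067, e.g. 12 -> 10 -> 8 months for an Annual filter."""
    windows = [round(first_window / ratio**k) for k in range(3)]
    y = np.asarray(x, dtype=float)
    for n in windows:
        y = running_mean(y, n)
    return y

# a pure 12-month cycle is removed almost completely
t = np.arange(240)
annual = np.sin(2 * np.pi * t / 12)
print(np.max(np.abs(ctrm(annual))))   # effectively zero
```

The first stage alone already zeroes an exact 12-month cycle; the two shorter stages are what suppress the sidelobes that a single running mean leaves behind.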

Triple Running Mean:

The simplest method to remove high frequencies or smooth data is to use moving averages, also referred to as running means. A running mean filter is the standard ‘average’ that is most commonly used in Climate work. On its own it is a very bad form of filter and produces a lot of arithmetic artefacts. Adding three of those ‘back to back’ in a cascade, however, allows for a much higher quality filter that is also very easy to implement. It just needs two more stages than are normally used.

—————

With all of this in mind, a CTRM filter, used either at 365 days (if we have that resolution of data available) or 12 months in length with the most common data sets, will completely remove the Annual cycle while retaining the underlying monthly sampling frequency in the output. In fact it is even better than that, as it does not matter whether the data used has already been normalised or not. A CTRM filter will produce the same output on either raw or normalised data, apart from a small offset corresponding to whatever ‘Normal’ period the data provider chose. There are no added distortions of any sort from the filter.
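The raw-versus-normalised claim follows from linearity: each stage's weights sum to one, so subtracting a constant baseline from the input shifts the output by exactly that constant. A quick check, with a single running-mean stage standing in for the cascade (the names and numbers here are illustrative, not from the article):

```python
import numpy as np

# one running-mean stage stands in for the cascade: every stage is linear
# and its weights sum to 1, so the same argument covers the full CTRM
def rmean(x, n=12):
    return np.convolve(np.asarray(x, dtype=float), np.ones(n) / n, mode="valid")

rng = np.random.default_rng(0)
raw = 14.0 + rng.normal(size=600)     # a 'raw' series with a baseline
anom = raw - raw[:360].mean()         # anomalies vs. a chosen 'Normal' period

diff = rmean(raw) - rmean(anom)
print(diff.max() - diff.min())        # ~0: the two outputs differ by a constant only
```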

Let’s take a look at what this generates in practice. The following are UAH Anomalies from 1979 to Present with an Annual CTRM applied:

Fig 1-Feb UAH Monthly Global Anomalies with CTRM Annual low pass filter

Fig 1: UAH data with an Annual CTRM filter

Note that I have just plotted the data points. The CTRM filter has removed the ‘visual noise’ that month to month variability causes. This is very similar to the 12 or 13 month single running mean that is often used, however it is more accurate, as the mathematical errors produced by those simple running means are removed. Additionally, the higher frequencies are completely removed while all the lower frequencies are left completely intact.

The following are HadCRUT4 Anomalies from 1850 to Present with an Annual CTRM applied:

Fig 2-Jan HadCrut4 Monthly Anomalies with CTRM Annual low pass filter

Fig 2: HadCRUT4 data with an Annual CTRM filter

Note again that all the higher frequencies have been removed and the lower frequencies are all displayed without distortions or noise.

There is a small issue with these CTRM filters in that CTRMs are ‘full kernel’ filters as mentioned above, meaning their outputs will not change when new data is added (except to extend the existing plot). However, because the output is in the middle of the input data, they do not extend up to the ends of the data available as can be seen above. In order to overcome this issue, some additional work will be required.
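The full-kernel behaviour is easy to demonstrate: appending new data leaves the previously computed output untouched and merely extends it, at the cost of losing output near both ends (a sketch; `ctrm12` is my own shorthand for the annual cascade):

```python
import numpy as np

def ctrm12(x):
    """Annual CTRM: cascaded 12-, 10- and 8-month running means."""
    y = np.asarray(x, dtype=float)
    for n in (12, 10, 8):
        y = np.convolve(y, np.ones(n) / n, mode="valid")
    return y

rng = np.random.default_rng(1)
series = rng.normal(size=480)

old = ctrm12(series[:400])   # the filter run before the new data arrived
new = ctrm12(series)         # the same filter after 80 more months appended

print(np.allclose(new[:old.size], old))   # True: the old output is unchanged
print(series.size - new.size)             # 27: samples lost at the ends
```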

The basic principles of filters work over all timescales, thus we do not need to constrain ourselves to an Annual filter. We are, after all, trying to determine how this complex load that is the Earth reacts to the constantly varying surface input and surface reflection/absorption with very long timescale storage and release systems including phase change, mass transport and the like. If this were some giant mechanical structure slowly vibrating away we would run low pass filters with much longer time constants to see what was down in the sub-harmonics. So let’s do just that for Climate.

When I applied a standard time/energy low pass filter sweep against the data I noticed that there is a sweet spot around 12-20 years where the output changes very little. This looks like it may well be a good stop/pass band binary chop point, so I chose 15 years as the roll off point to see what happens. Remember this is a standard low pass/band-pass filter, similar to the one that splits telephone from broadband to connect to the Internet. Using this approach, all components with periods longer than 15 years are fully preserved in the output and all shorter-period components are completely removed.

The following are HadCRUT4 Anomalies from 1850 to Present with a 15 year CTRM and a 75 year single running mean applied:

Fig 3-Jan HadCrut4 Monthly Anomalies with CTRM Annual, 15 and 75 years low pass filters

Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.

Now, when reviewing the plot above some have claimed that this is a curve fitting or a ‘cycle mania’ exercise. However, the data hasn’t been fitted to anything; I just applied a filter. Then out pops some wriggle in that plot, which the data draws all on its own, at around ~60 years. It’s the data what done it – not me! If you see any ‘cycle’ in the graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.

Note that the extra ‘greater than 75 years’ single running mean is included to remove the discovered ~60 year line, as one would normally do to get whatever residual is left. Only a single stage running mean can be used as the data available is too short for a full triple cascaded set. The UAH and RSS data series are too short to run a full greater than 15 year triple cascade pass on them, but it is possible to do a greater than 7.5 year which I’ll leave for a future exercise.

And that Full Kernel problem? We can add a Savitzky-Golay filter to the set, which is the Engineering equivalent of LOWESS in Statistics, so it should not meet too much resistance from statisticians (want to bet?).
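For the curious, S-G is just a sliding least-squares polynomial fit. A minimal numpy sketch (my own, since the article does not state its S-G window or order; both values below are illustrative guesses):

```python
import numpy as np

def savgol(x, window=181, order=2):
    """Minimal Savitzky-Golay smoother: least-squares polynomial fit over a
    sliding window, evaluated at the window centre (interior points only)."""
    half = window // 2
    t = np.arange(-half, half + 1)
    A = np.vander(t, order + 1, increasing=True)
    weights = np.linalg.pinv(A)[0]   # weights that pick out the fitted centre value
    return np.convolve(np.asarray(x, dtype=float), weights[::-1], mode="valid")

# an S-G filter of order p passes any polynomial of degree <= p unchanged
m = np.arange(400, dtype=float)
quad = 0.001 * m**2 - 0.05 * m + 2.0
print(np.allclose(savgol(quad, window=31, order=2), quad[15:-15]))  # True
```

Full implementations (e.g. scipy.signal.savgol_filter) also fit the partial windows at the ends, which is what lets the S-G trace extend to the present and hint at near-term direction.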

Fig 4-Jan HadCrut4 Monthly Anomalies with CTRM Annual, 15 and 75 years low pass filters and S-G

Fig 4: HadCRUT4 with additional S-G projections to observe near term future trends

We can verify that the parameters chosen are correct because the line closely follows the full kernel filter when that is used as a training/verification guide. The latest part of the line should not be considered an absolute guide to the future. Like LOWESS, S-G will ‘whip’ around on new data like a caterpillar searching for a new leaf. However, it tends to follow a similar trajectory, at least until it runs into a tree. While this is only a basic predictive tool, which estimates that the future will be like the recent past, it suggests that we are over a local peak and headed downwards…

And there we have it. A simple data treatment for the various temperature data sets, a high quality filter that removes the noise and helps us to see the bigger picture. Something to test the various claims made as to how the climate system works. Want to compare it against CO2? Go for it. Want to check SO2? Again, fine. Volcanoes? Be my guest. Here is a spreadsheet containing UAH and an Annual CTRM, and R code for a simple RSS graph. Please just don’t complain if the results from the data don’t meet your expectations. This is just data and summaries of the data. Occam’s Razor for a temperature series. Very simple, but it should be very revealing.

Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed? Do you know how I can make it more accurate, more effective or more accessible? What other data sets do you think might be good candidates for a CTRM filter? Are there any particular combinations of data sets that you would like to see? You may have noted the 15 year CTRM combining UAH, RSS, HadCRUT and GISS at the head of this article. I have been developing various options at my new Climate Data Blog and based upon your input on this thread, I am planning a follow up article that will delve into some combinations of data sets, some of their similarities and some of their differences.

About the Author: Richard Linsley Hood holds an MSc in System Design and has been working as a ‘Practicing Logician’ (aka Computer Geek) to look at signals, images and the modelling of things in general inside computers for over 40 years now. This is his first venture into Climate Science and temperature analysis.

355 Comments
RichardLH
March 17, 2014 5:57 pm

Bernie Hutchins says:
March 17, 2014 at 5:52 pm
“Yes – let’s stop being silly. It’s not lost because it WAS lost but you bring it back when you subtract.”
If you subtract the output (low pass) from the input stream (this is digital after all so that CAN be done with no phase delay as would be the case if this was analogue) then you end up with the high pass output instead. Simple maths.
The sum of the parts will always be the whole (rounding errors, etc. excepted).
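The low-pass/high-pass complement described here can be shown in a couple of lines (a sketch; a centred 12-point running mean stands in for any zero-phase low pass):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=200)

n = 12
low = np.convolve(x, np.ones(n) / n, mode="same")   # centred (zero-phase) low pass
high = x - low                                      # the complementary high pass

print(np.allclose(low + high, x))   # True: the parts sum to the whole
```

This is the digital, zero-delay subtraction being described; in analogue the low pass's phase delay would prevent an exact complement.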

RichardLH
March 17, 2014 5:59 pm

cd says:
March 17, 2014 at 5:56 pm
“OK. And again I ask you:
1) Apply filter to image.”
Stop right there. Take that filtered output image and subtract it from the original. You get the ‘negative’ filter to what you just applied.
In this case we have, low pass – high pass. No losses.

cd
March 17, 2014 6:03 pm

RichardLH
Again you answer another question. For all the huff and puff from Greg and yourself, Willis’ pertinent point (though poorly worded), has not been answered by either of you because for all your waffle (and while I think it may even be immaterial to Richards original point) you have backed yourselves into a corner on a sideline issue.
As I say, I feel I have been taken in by your self-asserted expertise.

RichardLH
March 17, 2014 6:05 pm

cd says:
March 17, 2014 at 6:03 pm
“As I say, I feel I have been taken in by your self-asserted expertise.”
I am sorry if I have disappointed. Please just tell me where you think the losses are going in the 15 year corner Gaussian and I will do my best to explain further.

Bernie Hutchins
March 17, 2014 6:05 pm

RichardLH says:
March 17, 2014 at 5:57 pm
Perhaps – to be constructive here – why not consider presenting your viewpoint in terms of “Perfect Reconstruction Filters” (related to “Sub-band Coders” and wavelets), and not up against the traditional jargon of “filtering”. That would solve everything.

cd
March 17, 2014 6:10 pm

RichardLH
Take that filtered output image and subtract it from the original. You get the ‘negative’ filter to what you just applied.
That is not what Willis asked you.
Again you answer a different question. One could assign a best fit plane, say, to a grey scale image (something you hate) to give you a colour gradient image, and yes, by adding the residual between the original image and the gradient image you can recreate the original image. But that does not prove that the gradient image has not lost information; rather the opposite, in that you need the original image to get back to the original image. And yes, the same is true for a processed image using a convolution filter.

RichardLH
March 17, 2014 6:13 pm

cd says:
March 17, 2014 at 6:10 pm
“That is not what Willis asked you. ”
Remind me again just exactly what it was you think Willis asked that I have not answered to your satisfaction.
I have been playing with images, filters, compressions and the like for way too long I suspect. 🙂

RichardLH
March 17, 2014 6:18 pm

Bernie Hutchins says:
March 17, 2014 at 6:05 pm
“Perhaps – to be constructive here – why not consider presenting your viewpoint in terms of “Perfect Reconstruction Filters” (related to “Sub-band Coders” and wavelets), and not up against the traditional jargon of “filtering”. That would solve everything.”
!!! You think that would be simpler?
I just don’t get how people can claim that filters lose information and then, when asked how and where the losses go if they do that, fail to come up with any explanation as to where and why. Just a bald statement of ‘fact’.
An assertion, nothing more.

cd
March 17, 2014 6:21 pm

RichardLH
Look I actually like your points on climate variability. But Willis’ point regarding loss of “data” (I think he meant information) after convolution stands, and a sharp one too.
In my opinion you have got involved in the wrong argument here and you’re wrong. Side tracked but that is in the nature of blogs. Anyways, goodnight.

RichardLH
March 17, 2014 6:28 pm

cd says:
March 17, 2014 at 6:21 pm
“Look I actually like your points on climate variability. But Willis’ point regarding loss of “data” (I think he meant information) after convolution stands, and a sharp one too.”
Thanks. Actually he was wrong. For a simple band pass splitter circuit in digital there are no losses as such, just possible incorrect band assignment.
It is a binary chop. Either low or high. Nowhere else to go. The parts always sum to the whole. You can choose which part to look at (or both if you require).
It is all always there.
Night.

Bart
March 17, 2014 6:31 pm

Willis Eschenbach says:
March 17, 2014 at 4:50 pm
“The idea that we can squint at a smoothed temperature record and say what has caused the swings, however, is a bridge far too far.”
If you need to squint to see the ~60 year signal, it’s probably time to get a new lens prescription.
It also isn’t as important to say what caused it as to say what assuredly did not. Rising CO2 did not. It started before the rise, and it looks nothing like it.

March 17, 2014 6:36 pm

cd: “That is the same principle of a B-Spline, and is based in the prescribed level of the spline.”
No. There is no curve fitting or basis functions involved in a discrete filter. Convolution in the time domain is mathematically equivalent to multiplication in the frequency domain. The filter’s frequency response (ideally, for a LP, the response = 1 in the pass-band and 0 for frequencies above) is multiplied by the signal’s spectrum, retaining the spectrum in the pass-band and rejecting that part of the signal’s spectrum in the stop-band. A B-spline, on the other hand, merges polynomial fits at the segment boundary points (the so-called knots) and thus, unlike a filter, has no definable, data-independent frequency response. That’s the key point: a filter’s frequency response is invariant and independent of the data, and the technique is unrelated to curve fitting algorithms.
All that being said, the point about information loss assumes the spectrum of the information of interest is confined to the pass-band of the filter. If it is not, information is surely lost. In Richard’s plot’s for instance, all information about annual variation is lost. If I’m listening to a violin concerto on my stereo and turn down the treble knob, I can no longer hear the high notes as well and again, information is lost.
Assuming the signal is spectrally localizable, it is up to the user to decide what defines the “signal” of interest and to design a filter appropriate to the task. It should be pointed out that the residual (the part of the input signal rejected by the LPF) by definition attenuates long-term trend information, and thus all of this information, which is what we are after, is contained in the output. That is, I think, what Greg and Richard mean by “nothing is lost”, although that’s not how I would phrase it.
Regards,
Jeff

Bernie Hutchins
March 17, 2014 6:41 pm

RichardLH said March 17, 2014 at 6:18 pm in part:
” !!! You think that would be simpler? ”
Not if you and/or Greg do not understand EXACTLY what I am suggesting, and that this is already pretty much what you are using to defend your point of view: putting all the information in two or more bands that can then be perfectly reconstructed. You seem to be trying to reinvent this notion. It’s been done, and it’s easy in concept. Even simpler! The simplest PR filters are, in fact, a length-2 moving average and a length-2 differencer. Imperfect filters, but no information lost – as you want. No one will argue information loss if this is your angle. Because no information IS lost.

ThinkingScientist
March 17, 2014 6:59 pm

To RichardLH
Just to let you know, I think you have been extraordinarily patient with some of the critics here. As a geophysicist myself, with more than a passing understanding of low and high pass filtering (and as an audiophile and user of PA equipment, including subwoofer design), I have no problem in understanding the utility of the simple (but clever) filter you have presented, or the significance of the low pass/high pass threshold being a non-critical choice over a wide range of frequencies. Makes sense to me.
Some of the criticisms on this web page from people with apparently high IQs are quite strange. It’s as though they want to pick a fight with what seems to me a simple and pragmatic way of implementing a low pass/high pass splitter filter with no significant side effects. In a spreadsheet. With just 3 columns.
Simples.

cd
March 17, 2014 7:05 pm

Jeff Patterson
No. There is no curve fitting or basis functions involved in a discrete filter.
Perhaps I’m missing it, but I don’t think he is using a discrete filter in the sense of the sample window.
Convolution in the time domain is mathematically equivalent to multiplication in the frequency domain.
Thanks for that, I’d assumed we all knew that. Although binary filters such as mode/median kernel operations have no equivalent. And please don’t involve wavelet transforms at this stage!
All that being said, the point about information loss assumes the spectrum of the information of interest is confined to the pass-band of the filter.
Well there’s the money statement!
I think we’re all in agreement and Willis is right on the point of loss of “data”. Richard and Greg knew exactly what he meant (loss of information) but thought by admission on being wrong on one point meant all was wrong. Bit like Richard Dawkins, who would have anyone who challenges established Darwinian theory as being some sort of creationist (akin to denier). On the side of science, APPARENTLY, he proclaims there shall be no debate! How enlightened these experts are.

cd
March 17, 2014 7:06 pm

Sorry omission rather than admission.

cd
March 17, 2014 7:13 pm

RichardLH
For a simple band pass splitter circuit in digital there are no losses as such, just possible incorrect band assignment.
Again, without all the gobbledygook (obfuscation): can you reconstruct the original signal from the post-processed series + the “inverse” convolution (what’s that you say, there is no inverse!)?
So you can’t. OK then information is lost! And Willis was right!

Editor
March 17, 2014 7:32 pm

Greg Goodman says:
March 17, 2014 at 3:28 pm

Willis:

“So even ordinary averaging is not uniquely invertible, and thus it loses information. The same is true for the gaussian average, or a boxcar average, or most filters.”

That is true for a straight average and RM, since all the weights are equal. It is not the case for a gaussian or other kernel where the weights are unique.
The reverse process is called deconvolution. It’s basically a division in the frequency domain in the same way a convolution is multiplication in the freq domain. For example you can ‘soften’ a digital image with a gaussian ‘blur’ ( a 2-D gaussian low-pass filter ). You can then ‘unblur’ with a deconvolution. There are some artefacts due to calculation errors but in essence it is reversible.

Well, I’ll be the third person to strongly question this claim. To start with, whether the weights are unique is immaterial to this question, because the weights in a gaussian average are symmetrical and thus decidedly not unique …
The problem, however, is not the uniqueness of the weights. It is that at the end of the day we have more variables than equations. It is inherent in the convolution process. For each step, we add one value to the front end of the filter, and remove one from the other end.
As a result, we can definitely improve the resolution as you point out regarding the Hubble telescope. But (as someone pointed out above) if you are handed a photo which has a gaussian blur, you can never reconstruct the original image exactly.
You can test this mentally by considering a gaussian blur of a black and white image at successively larger and larger scales. The image gets more and more blurry as you increase the size of the gaussian filter, until eventually it is just fluctuations in some middle gray.
IF your theory were correct, then knowing the convolution weights you could take that sheet of fluctuations in gray and reconstruct the original photo … and even you must admit that’s not possible whether or not you know the weights of the convolution.
In other words, you sound very persuasive and knowledgeable, but you are very wrong. Here’s a simple example:

The green columns and the purple columns are the two datasets, Data A and Data B. Each dataset has three points (N=3), which are quite different.
In each case I used the identical filter, a 13-point = ± 3 standard deviations gaussian filter. The red line is the filtered version of Data B. The blue line is the filtered version of Data A. What blue line, you say? The one you can’t see because it’s hidden by the red line …
Now, can you deconvolve that? Sure … but as you can see, gaussian averages that are indistinguishable from each other lead to widely differing answers as to the underlying data.
Now that doesn’t mean all is lost. It’s not. While there are a variety of datasets that would have nearly indistinguishable gaussian averages, it is a finite number, and a finite number of results. And those results (possible data configurations) have a distribution. That means you can take an educated guess at the underlying data, and you’ll do better than flipping coins.
It doesn’t mean that knowing the weights you can reconstruct the original data. Not possible.
w.
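Willis’s many-to-one argument can be reproduced without his chart, using a plain 3-point mean in place of his 13-point Gaussian (an illustration, not his spreadsheet):

```python
import numpy as np

# two different inputs whose smoothed interiors coincide: the filter
# output alone cannot tell them apart, so exact inversion is impossible
kernel = np.ones(3) / 3
a = np.array([0.0, 3.0, 0.0])
b = np.array([1.0, 1.0, 1.0])

print(np.convolve(a, kernel, mode="valid"))  # [1.]
print(np.convolve(b, kernel, mode="valid"))  # [1.]
```

The interior (‘valid’) outputs coincide even though the inputs differ; only the partial windows at the edges would distinguish them, which is exactly the more-variables-than-equations point above.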

March 17, 2014 8:32 pm

The issue of filter inversion (not that it’s germane) need not be argued. Every filter is uniquely defined by its Z-transform. In the z-domain, Y[z] = C[z]*X[z], where x[n], y[n] are the input and output signals respectively and c[n] the impulse response of the filter. Since X[z] = Y[z] C^-1[z], the input signal can be recovered exactly from y[n] by the inverse filter C^-1[z] iff C^-1[z] is causal and stable. Some filters are invertible, some aren’t. To be invertible, the ROC of C[z] and C^-1[z] must overlap, which is not always the case.

March 17, 2014 8:35 pm

cd says:
Perhaps I’m missing it, but I don’t think he is using a discrete filter in the sense of the sample window.
I see nothing in the description that would lead me to believe that the filter is not a simple FIR. What leads you to the counter-conclusion?

March 17, 2014 8:57 pm

Bart says:
March 17, 2014 at 6:31 pm
Willis Eschenbach says:
March 17, 2014 at 4:50 pm
“The idea that we can squint at a smoothed temperature record and say what has caused the swings, however, is a bridge far too far.”
Bart: It also isn’t as important to say what caused it as to say what assuredly did not. Rising CO2 did not. It started before the rise, and it looks nothing like it.
This is the point that invariably gets lost in these my-IQ-is-higher-than-your-IQ slug fests that break out any time someone posts on the subject of quasi-periodicity in the observational data. Whether the “65 year cycle” is a near line spectrum of astronomical origin, a resonance in the climate dynamics, a strange attractor of an emergent, chaotic phenomenon or a statistical anomaly, what it isn’t is man-made. It is, however, responsible for the global warming scare. If it had not increased the rate of change of temperature during its maximum slope period of the 1980s and 90s, the temperature record would be unremarkable (see e.g. “the cycle that changed history”) and we wouldn’t be here arguing the banalities of digital filter design.

Bart
March 18, 2014 12:01 am

Jeff Patterson says:
March 17, 2014 at 8:57 pm
Bingo!

Bart
March 18, 2014 12:16 am

Though, I take issue with your link where it says: “natural, emergent cycles simply do not exhibit this type of stable behavior”. They do all the time. It is simply a matter of having energy storage with a high “Q” response.

Bart
March 18, 2014 12:29 am

Your critic was amusing. Seems to think that once someone conjectures a potential cause for something, however thinly supported, he is then entitled to assume it is the cause, and the question yields to his begging it.

Matthew R Marler
March 18, 2014 12:32 am

Richard LH: Curve fitting is deciding a function and then applying it to the data.
That may be your definition, but it is certainly idiosyncratic. Most curve fitting entails choosing lots of functions that may be fit to all or part of the data, as with Bayesian adaptive regression splines. The result is then called the “fitted curve” , or the “fitted curves” when many are compared.
