Image Credit: Climate Data Blog
By Richard Linsley Hood – Edited by Just The Facts
The goal of this crowdsourcing thread is to present a 12 month/365 day Cascaded Triple Running Mean (CTRM) filter, inform readers of its basis and value, and gather your input on how I can improve and develop it. A 12 month/365 day CTRM filter completely removes the annual ‘cycle’, as the CTRM is a near Gaussian low pass filter. In fact it is slightly better than Gaussian, in that it completely removes the 12 month ‘cycle’ whereas a true Gaussian leaves a small residual of it in the data. This new tool is an attempt to produce a more accurate treatment of climate data and to see what new perspectives, if any, it uncovers. It builds on the good work by Greg Goodman, with Vaughan Pratt’s valuable input, on this thread on Climate Etc.
Before we get too far into this, let me explain some of the terminology that will be used in this article:
—————-
Filter:
“In signal processing, a filter is a device or process that removes from a signal some unwanted component or feature. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often, this means removing some frequencies and not others in order to suppress interfering signals and reduce background noise.” Wikipedia.
Gaussian Filter:
A Gaussian Filter is probably the ideal filter in time domain terms. That is, if you consider that the graphs you are looking at are like traces displayed on an oscilloscope, then a Gaussian filter is the one that adds the least distortion to the signal.
Full Kernel Filter:
Indicates that the output of the filter will not change when new data is added (except to extend the existing plot). It does not extend up to the ends of the data available, because the output is in the centre of the input range. This is its biggest limitation.
Low Pass Filter:
A low pass filter is one which removes the high frequency components in a signal. One of its most common usages is in anti-aliasing filters for conditioning signals prior to analog-to-digital conversion. Daily, Monthly and Annual averages are low pass filters also.
Cascaded:
A cascade is where you feed the output of the first stage into the input of the next stage and so on. In a spreadsheet implementation of a CTRM you can produce a single average column in the normal way and then use that column as an input to create the next output column and so on. The value of the inter-stage multiplier/divider is very important. It should be set to 1.2067. This is the precise value that makes the CTRM into a near Gaussian filter. It gives values of 12, 10 and 8 months for the three stages in an Annual filter for example.
Triple Running Mean:
The simplest method to remove high frequencies or smooth data is to use moving averages, also referred to as running means. A running mean filter is the standard ‘average’ that is most commonly used in Climate work. On its own it is a very bad form of filter and produces a lot of arithmetic artefacts. Adding three of those ‘back to back’ in a cascade, however, allows for a much higher quality filter that is also very easy to implement. It just needs two more stages than are normally used.
—————
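To make the cascade concrete, here is a minimal sketch in Python (not the author's spreadsheet or R code): three plain centred running means of 12, 10 and 8 months chained together, with the stage lengths derived from the 1.2067 ratio. Fed a pure 12 month cycle, the cascade removes it completely, as claimed.

```python
import numpy as np

def running_mean(x, n):
    """Centred n-point running mean; output is shorter than the input (full kernel)."""
    return np.convolve(x, np.ones(n) / n, mode="valid")

def ctrm(x, n=12, k=1.2067):
    """Cascaded Triple Running Mean: three boxcars of length n, n/k, n/k^2 (rounded)."""
    for length in (n, round(n / k), round(n / k ** 2)):
        x = running_mean(x, length)
    return x

# A pure 12 month 'cycle' is removed completely (down to floating point noise),
# because the first 12-point stage averages over exactly one full period.
t = np.arange(240)                       # 20 years of monthly samples
annual_cycle = np.sin(2 * np.pi * t / 12)
out = ctrm(annual_cycle)
print(np.max(np.abs(out)))               # effectively zero
```

Note that `round(12 / 1.2067)` gives 10 and `round(12 / 1.2067 ** 2)` gives 8, reproducing the 12, 10, 8 month stages described above.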
With all of this in mind, a CTRM filter, used either at 365 days (if we have that resolution of data available) or 12 months in length with the most common data sets, will completely remove the Annual cycle while retaining the underlying monthly sampling frequency in the output. In fact it is even better than that, as it does not matter whether the data used has already been normalised. A CTRM filter will produce the same output on either raw or normalised data, with only a small offset that reflects whatever ‘Normal’ period was chosen by the data provider. There are no added distortions of any sort from the filter.
Let’s take a look at what this generates in practice. The following are UAH Anomalies from 1979 to Present with an Annual CTRM applied:
Fig 1: UAH data with an Annual CTRM filter
Note that I have just plotted the data points. The CTRM filter has removed the ‘visual noise’ that month to month variability causes. This is very similar to the 12 or 13 month single running mean that is often used, however it is more accurate, as the mathematical errors produced by those simple running means are removed. Additionally, the higher frequencies are completely removed while all the lower frequencies are left completely intact.
The following are HadCRUT4 Anomalies from 1850 to Present with an Annual CTRM applied:
Fig 2: HadCRUT4 data with an Annual CTRM filter
Note again that all the higher frequencies have been removed and the lower frequencies are all displayed without distortions or noise.
There is a small issue with these CTRM filters in that CTRMs are ‘full kernel’ filters as mentioned above, meaning their outputs will not change when new data is added (except to extend the existing plot). However, because the output is in the middle of the input data, they do not extend up to the ends of the data available as can be seen above. In order to overcome this issue, some additional work will be required.
The basic principles of filters work over all timescales, thus we do not need to constrain ourselves to an Annual filter. We are, after all, trying to determine how this complex load that is the Earth reacts to the constantly varying surface input and surface reflection/absorption with very long timescale storage and release systems including phase change, mass transport and the like. If this were some giant mechanical structure slowly vibrating away we would run low pass filters with much longer time constants to see what was down in the sub-harmonics. So let’s do just that for Climate.
When I applied a standard time/energy low pass filter sweep against the data I noticed that there is a sweet spot around 12-20 years where the output changes very little. This looks like it may well be a good stop/pass band binary chop point, so I chose 15 years as the roll off point to see what happens. Remember this is a standard low pass filter, similar to the one that splits telephone from broadband to connect to the Internet. Using this approach, all frequencies with periods longer than 15 years are fully preserved in the output and all frequencies below that point are completely removed.
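For the record, the same 1.2067 inter-stage ratio fixes the stage lengths of the greater-than-15-year pass. A quick sketch (my reading of the method, assuming monthly data; not taken from the author's spreadsheet):

```python
# Stage lengths in months for a 15 year (180 month) CTRM, inter-stage ratio 1.2067.
k = 1.2067
stages = [round(180 / k ** i) for i in range(3)]
print(stages)  # [180, 149, 124]
```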
The following are HadCRUT4 Anomalies from 1850 to Present with a 15 year CTRM and a 75 year single running mean applied:
Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.
Now, when reviewing the plot above some have claimed that this is a curve fitting or ‘cycle mania’ exercise. However, the data hasn’t been fit to anything, I just applied a filter. Then out pops a wriggle at around ~60 years which the data draws all on its own. It’s the data what done it – not me! If you see any ‘cycle’ in the graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.
Note that the extra ‘greater than 75 years’ single running mean is included to remove the discovered ~60 year line, as one would normally do to get whatever residual is left. Only a single stage running mean can be used as the data available is too short for a full triple cascaded set. The UAH and RSS data series are too short to run a full greater than 15 year triple cascade pass on them, but it is possible to do a greater than 7.5 year which I’ll leave for a future exercise.
And that Full Kernel problem? We can add a Savitzky-Golay filter to the set, which is the Engineering equivalent of LOWESS in Statistics, so should not meet too much resistance from statisticians (want to bet?).
Fig 4: HadCRUT4 with additional S-G projections to observe near term future trends
We can verify that the parameters chosen are correct because the line closely follows the full kernel filter if that is used as a training/verification guide. The latest part of the line should not be considered an absolute guide to the future. Like LOWESS, S-G will ‘whip’ around on new data like a caterpillar searching for a new leaf. However, it tends to follow a similar trajectory, at least until it runs into a tree. While this is only a basic predictive tool, which estimates that the future will be like the recent past, it estimates that we are over a local peak and headed downwards…
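For readers who want to try the end-point extension themselves, SciPy ships a stock Savitzky-Golay filter. The window length and polynomial order below are illustrative only, not the values behind Fig 4:

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic monthly-style series: trend + cycle + noise (illustrative data only).
t = np.linspace(0, 10, 201)
y = 0.5 * t + np.sin(t) + 0.1 * np.random.default_rng(0).normal(size=t.size)

# 31-point window, cubic fit. mode="interp" fits a polynomial over the last
# half-window, so the smoothed line runs right up to the ends of the data --
# the practical advantage over a full kernel filter.
smooth = savgol_filter(y, window_length=31, polyorder=3, mode="interp")
print(smooth.shape)  # same length as the input
```

A useful sanity check of the method: applied to an exact cubic, a cubic S-G fit reproduces the input unchanged, ends included.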
And there we have it. A simple data treatment for the various temperature data sets, a high quality filter that removes the noise and helps us to see the bigger picture. Something to test the various claims made as to how the climate system works. Want to compare it against CO2? Go for it. Want to check SO2? Again, fine. Volcanoes? Be my guest. Here is a spreadsheet containing UAH and an Annual CTRM, and R code for a simple RSS graph. Please just don’t complain if the results from the data don’t meet your expectations. This is just data and summaries of the data. Occam’s Razor for a temperature series. Very simple, but it should be very revealing.
Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed? Do you know how I can make it more accurate, more effective or more accessible? What other data sets do you think might be good candidates for a CTRM filter? Are there any particular combinations of data sets that you would like to see? You may have noted the 15 year CTRM combining UAH, RSS, HadCRUT and GISS at the head of this article. I have been developing various options at my new Climate Data Blog and based upon your input on this thread, I am planning a follow up article that will delve into some combinations of data sets, some of their similarities and some of their differences.
About the Author: Richard Linsley Hood holds an MSc in System Design and has been working as a ‘Practicing Logician’ (aka Computer Geek) to look at signals, images and the modelling of things in general inside computers for over 40 years now. This is his first venture into Climate Science and temperature analysis.





Steven Mosher says:
March 16, 2014 at 7:03 pm
“Richard. They all use different stations”
As I said, they are all drawn from the same set of thermometers. The selection differs, set to set. When looking at the oldest ones at greater than 150 years or so, the choices available are fairly restricted though.
I have comparisons between the various offerings. They do differ and in some surprising ways. Lots of work still to do 🙂
Bart says:
March 16, 2014 at 7:00 pm
I was pointing out that whilst there may be a superficial match, it fails when compared to earlier data and therefore may be suspect on those grounds alone.
In any case, unless the temperature starts rising quite quickly fairly soon, the CO2 case gets weaker by the month.
RichardLH said in part March 16, 2014 at 6:38 pm
“Too many animations of images in the work I have done elsewhere I suppose, it just looks like a caterpillar to me 🙂 Sorry.
http://en.wikipedia.org/wiki/File:Lissage_sg3_anim.gif”
Richard – Thanks yet again. That helps. Glad we are LTI.
But you are completely wrong (misled by that animation I think) about whipping. None of those yellow polynomials in the animation you posted whipping about are ever even computed. Instead the SG fit is a FIXED FIR filter. Now, to be honest, the first time I learned this, the instructor’s words were exactly: “Astoundingly, this is an FIR filter.” I WAS astounded – still am. No whipping.
[ Aside: In general, polynomials as models of signals are going to be unsuited. Signals are mainly horizontal, while polynomials are vertically in a hurry to get to + or – infinity. Your whipping. ]
All you need do is populate a non-square matrix with integers (like -5 to +5 for length 11), raise these integers to successive powers for the rows below, take the least-squares (pseudo) inverse for the non-square matrix, and use the bottom row as the impulse response. (It is basically “sinc like”.) That’s the nuts and bolts – no polynomial fitting, and always the same answer for a given length and order. See my previous app note link. Yea – magic!
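Bernie's recipe can be written out in a few lines (a sketch, with NumPy's `pinv` standing in for Matlab's). For the classic 7-point, order-3 case the pseudo-inverse route reproduces the textbook Savitzky-Golay smoothing weights (-2, 3, 6, 7, 6, 3, -2)/21:

```python
import numpy as np

x = np.arange(-3, 4)                  # 7 sample positions: -3 .. +3
A = np.vander(x, 4, increasing=True)  # columns are x^0, x^1, x^2, x^3
h = np.linalg.pinv(A)[0]              # least-squares fit; row 0 = fitted value at x=0

# h is the fixed FIR impulse response. No polynomial is ever 'whipped' around:
# the same weights are applied at every position of the moving window.
print(np.round(h * 21))  # [-2.  3.  6.  7.  6.  3. -2.]
```

The only difference from Bernie's description is bookkeeping: with the powers laid out as columns (rather than rows), the smoothing weights come out as the top row of the pseudo-inverse instead of the bottom one.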
So it is no more complicated (once rather easily computed) than Gaussian. And it’s a much better filter.
Does anybody know what the result will look like when the “correct” smoothing method is used?
Is this better than piecewise polynomial regression or smoothing by projection on b-splines with knots estimated from the data (Bayesian adaptive regression splines as used by R. E. Kass with neuronal data analysis)?
It might be a useful pedagogical exercise to have a booklet of graphs, with each of 4 or more temperature series smoothed by many different methods, not excluding polynomials and trig polynomials (with fixed and estimated periods) of up to order 10.
RichardLH
Pulsed input is also integrated by matter as well so I will still differ.
>>>>>>>>>>>>>>>.
It most certainly is not. The tropics are net absorbers of energy, the arctic regions are net emitters. Massive amounts of energy are moved from tropics to arctic regions by means of air and ocean currents, and these are not integrated by matter as they would be if the sole mechanism was conductance. You’ve got an interesting way of looking at the data, just apply it to data that is meaningful.
A beautiful example of the frequency content that I expect to be found in millennial scale uncut temperature records is found in Lui-2011 Fig. 2. In China there are no hockey sticks. The grey area on the left of the Fig. 2 chart is the area of low frequency, the climate signal. In the Lui study, a lot of the power is in that grey area.
Copied from a post at ClimateAudit Oct. 31, 2011 Best Menne Slices
Lui sees a minor 60 (or 70) year cycle. But it seems less significant than the others.
Matthew R Marler said in part: March 16, 2014 at 7:23 pm
“Does anybody know what the result will look like when the “correct” smoothing method is used?”
No. Nobody knows! Unless you have a physical argument or other evidence, curve fitting is near useless. And the physics – let’s call it “incomplete”. Again, data fit to a curve is not THE data. Curve fitting destroys information. You can always get the proposed curve (again), but you can’t go back.
Many years ago I did a physics lab experiment and plotted data on every possible type of graph paper I could find in the campus store. My professor (Herbert Mahr) was kind enough to compliment my artistic efforts, but then assured me that none of it necessarily MEANT ANYTHING. Indeed.
meanwhile, the MSM/CAGW crowd are sourcing this today:
16 March: Phys.org: Greenland implicated further in sea-level rise
An international team of scientists has discovered that the last remaining stable portion of the Greenland ice sheet is stable no more.
The finding, which will likely boost estimates of expected global sea level rise in the future, appears in the March 16 issue of the journal Nature Climate Change…
“Northeast Greenland is very cold. It used to be considered the last stable part of the Greenland ice sheet,” explained GNET lead investigator Michael Bevis of The Ohio State University. “This study shows that ice loss in the northeast is now accelerating. So, now it seems that all of the margins of the Greenland ice sheet are unstable.”
Historically, Zachariae drained slowly, since it had to fight its way through a bay choked with floating ice debris. Now that the ice is retreating, the ice barrier in the bay is reduced, allowing the glacier to speed up—and draw down the ice mass from the entire basin…
http://phys.org/news/2014-03-greenland-implicated-sea-level.html
Sorry guys but this is absolutely worthless. There’s no data-set worth an iota pre-satellite era and post-satellite era lacks enough data to be useful. Even if there were a data-set that was accurate back to circa 1850 that’s still an incredibly short snippet of time wrt Milankovitch Cycle lengths and even the supposed Bond Cycle lengths.
John West says:
March 16, 2014 at 8:17 pm
Sorry guys but this is absolutely worthless.
In that case, how would you prove that the warming that is occurring now is not catastrophic?
NOAA and Santer suggested straight lines should be used. Perhaps curves should be used, but it may be harder to define when models are useless in terms of sine waves.
jai mitchell says:
March 16, 2014 at 2:39 pm
. . .hilarious. . .
—————————————————-
You quote skepticalscience, and you laugh at other people?
Now that is hilarious.
Richard, does the fact that months are of unequal length screw up the plot?
bones says:
March 16, 2014 at 6:12 pm
“Last time I looked, the atmospheric CO2 concentration was increasing at about 5% per decade, which gives it a doubling time of about 140 years.”
Bones, recall we are looking at the doubling time of the anthropogenic contribution. One must subtract the constant value of preindustrial times, as I mentioned. The best-fit exponential says that value is about 256.7 ppm. It has been more than 42 years since the first Mauna Loa measurements in March of 1958 so we should see the doubling occurring around the year 2000. In fact that first observation in 1958 was about 314.4 ppm–subtracting 256.7 we get 57.7 ppm added to the preindustrial. When that hits 115.4 we have our doubling. That happened in December 2001, when the observed value was 372.2 ppm. (subtracting 115.4 gets us back to the preindustrial level.) Ergo, observed doubling time is 43 years, pretty close to the calculated 42.
If you want to know the time until the preindustrial concentration is doubled, extending this present best-fit exponential into the future shows it crossing a value of 512 ppm CO2 in 2050. Some say the preindustrial concentration was 280 ppm, so the doubling to 560 ppm occurs in 2061. Note this is not a prediction, just a report on what the present best-fit 3-parameter exponential through Sept 2013 shows. On the other hand, considering the track record of this fit (<1% error for some 650 consecutive months), I wouldn’t bet against it.
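The doubling arithmetic in the comment above checks out; here it is as a short sketch (the 256.7 ppm baseline and the quoted readings are taken straight from the comment, not independently verified):

```python
pre = 256.7                     # best-fit preindustrial baseline (ppm), as quoted
first = 314.4                   # Mauna Loa, March 1958 (ppm), as quoted
excess_1958 = first - pre       # anthropogenic excess in 1958
doubled = pre + 2 * excess_1958 # concentration at which that excess has doubled
print(round(excess_1958, 1), round(doubled, 1))  # 57.7 372.1
```

The computed 372.1 ppm threshold matches the ~372.2 ppm observed in December 2001, giving the ~43 year observed doubling time.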
Hello Richard,
I have had an opportunity to look at your spreadsheet, btw, thanks for sharing it. I have some observations and questions about the methods used.
1. You use three non causal cascaded FIR filters, with 8, 10, and 12 taps respectively.
2. These are just plain averaged filters without any weighting.
3. While doing this will provide you a pseudo Gaussian response, why not use some weighting parameters for a Gaussian response?
4. By using even number taps you are weighting the averages to one side or the other in the time series, and this will affect your phase relationship.
5. Were any windowing corrections considered?
6. Were there any Fourier analysis performed on the data before and after filtering?
I use filters on data all of the time in my work, and am always concerned when someone does filtering and fails to mention Fco, Order, and Filtering Function (i.e. Bessel, Butterworth, Gaussian, etc).
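On question 6: the frequency response of the cascade can be computed in closed form, without a Fourier analysis of the data itself. Each unweighted boxcar contributes a Dirichlet-kernel magnitude response, and the 12-tap stage places a null exactly on the annual frequency. A sketch (my own illustration, not from the spreadsheet):

```python
import numpy as np

def boxcar_response(n, f):
    """Magnitude response of an n-point moving average at frequency f (cycles/sample).
    Uses np.sinc(x) = sin(pi x)/(pi x), so |sin(pi f n)/(n sin(pi f))| = |sinc(n f)/sinc(f)|."""
    return np.abs(np.sinc(n * f) / np.sinc(f))

def ctrm_response(f):
    """Combined response of the 12-10-8 cascade."""
    return boxcar_response(12, f) * boxcar_response(10, f) * boxcar_response(8, f)

annual = ctrm_response(1 / 12)   # response at the 12 month cycle
print(annual)                    # numerically zero: the annual cycle is nulled
```

Near DC the response is unity (long periods pass untouched), while the null at f = 1/12 is what distinguishes the CTRM from a same-width Gaussian, which leaves a small annual residual.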
Thanks again,
Richard
jai mitchell says:
March 16, 2014 at 2:39 pm
“Since this constant addition of CO2 is not causing any warming it follows that the theory of enhanced greenhouse warming is defective. It does not work and should be discarded.”
. . .hilarious. . .
——————————————————————————————————–
Even more hilarious is if you do the calcs for the volume of ocean down to the depth quoted then the graph you quote represents only a few hundredths of 1 degree Centigrade rise in temperature. i.e. 2/5ths of bugger-all. And there is no evidence that the CO2 bogey is responsible.
RichardLH
Regarding our Savitzky-Golay Discussion: Here is a less cryptic but just single page outline of the SG.m program (first 20 trivial lines at most!), offered with the Matlab pinv function, and without it, making it possible with any math program that will invert an ordinary square matrix. The example is order three (order 3 polynomial), length seven (7 points moving through the smoother). The URL of the app note is on the jpg and in the thread above.
http://electronotes.netfirms.com/SGExample.jpg
Bernie
In examining share (stock) prices I use a multiple moving average tool (in the Metastock program). I have given the short term moving averages (less than 30 days) a different color to the long term moving averages (30 to 60 days). When the long term and short term cross it is a signal of a change which can be used as a buy or sell (with other signals). Have you considered other time periods for your filter and putting them on the same graph? Maybe that will point to the unwarranted adjustments made to data, or to a significant natural shift which many say occurred around 1975 (at least in the Pacific, and nothing to do with CO2).
Talking of CO2, there were measurements going back to the early 1800’s, see here http://www.biomind.de/realCO2/realCO2-1.htm . There may be questions about accuracy, but the results show variations and throw doubt on the ASSUMED constant CO2 levels prior to 1966.
On the Y axis: CO2 concentration
On the X axis: temperature over time
X and Y are not necessarily connected. Unless funding depends on a shallow statement in your research paper conclusion to allow this.
wbrozek says in part March 16, 2014 at 8:33 pm
“In that case, how would you prove that the warming that is occurring now is not catastrophic?”
You ask a good question and since no one else is apparently still up, I will give it a shot.
In physics, particularly with regard to chaotic non-linear systems (like the climate), but still constrained ones, there is really no such thing as “proof”. There is instead evidence. The various so-called “global temperature” series, although surely imperfect, are evidence (not proof) that the catastrophic “hockey stick” warming is likely wrong.
The closest thing to establishing the truth is almost certainly the 2nd Law of Thermodynamics. Essentially heat moves from where there is more to where there is less. If the path of flow is not obviously in place, Nature is OBLIGED to provide it. In consequence, certain negative feedback thermostatting mechanisms are mandatory. No engineer or HVAC technician required. The great physicist (is there any other kind of physicist) Eddington famously and amusingly said, often quoted here I believe:
“If someone points out to you that your pet theory of the universe is in disagreement with Maxwell’s equations—then so much the worse for Maxwell’s equations. If it is found to be contradicted by observation—well these experimentalists do bungle things sometimes. But if your theory is found to be against the second law of thermodynamics I can give you no hope; there is nothing for it but to collapse in deepest humiliation.”
Any folks we know!
First, Richard, thanks for your work. Also, kudos for the R code, helped immensely.
My first question regarding the filter is … why a new filter? What defect in the existing filters are you trying to solve?
Now, be aware I have no problem with seeking better methods. For example, I use my own method for dealing with the end points problem, as I discussed here.
However, you say:
Mmmm … if that’s the only advantage, I’d be hesitant. I haven’t run the numbers but it sounds like for all practical purposes they would be about identical if you choose the width of the gaussian filter to match … hang on … OK, here’s a look at your filter versus a gaussian filter:


As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?
Next, you go on to show the following graphic and comment:
There is indeed a “wiggle” in the data, which incidentally is a great word to describe the curve. It is a grave mistake, however, to assume or assert that said wiggle has a frequency or a cycle length or a phase. Let me show you why, using your data:

The blue dashed vertical lines show the troughs of the wiggles. The red dashed vertical lines show the peaks of the wiggles.
As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. Look at the variation in the lengths of the rising parts of the wiggle—18 years, 40 years, and 41 years. The same is true of the falling parts of the wiggle. They are 29 years in one case and 19 years in the other. Nothing even resembling regular.
The problem with nature is that you’ll have what looks like a regular cycle … but then at some point, it fades out and is replaced by some other cycle.
To sum up, you are correct that “what you can’t do is say the wriggle is not there”. It is there.
However it is not a cycle. It is a wiggle, from which we can conclude … well … nothing, particularly about the future.
Best regards,
w.
Bart : http://s1136.photobucket.com/user/Bartemis/media/CO2GISS.jpg.html?sort=3&o=0
That is very interesting. The short term (decadal) correlation of d/dt(CO2) to SST has been discussed quite a bit and is irrefutable. The ice core records seem to show direct CO2 vs temp correlation. So the question is where in between does it change? There will be a sliding mix as the response changes from one extreme to the other.
If earlier data diverges it may be the first indication of the centennial relationship or equally likely an indication of spurious data adjustments. That would be worth investigating.
Do you have a similar graph that goes back to say 1850?
Willis: “As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?”
It is not indistinguishable, as Richard correctly says it totally removes an annual signal. You attempt to show this is “indistinguishable” by using them to filter temperature “anomaly” data that has already had most of the annual signal removed.
Now do the same with actual temps and you will see the difference. The advantage of having a filter that can fully remove the huge annual cycle is that you don’t need to mess about with “anomaly” data, which themselves leak a possibly inverted 12 month signal as soon as the annual cycle changes from that of the reference period.
The one defect I see with triple RM filters is that they are a three pole filter and thus start to attenuate the signal you want to keep (gaussian, being similar, has the same defect).
SG has a nice flat pass band, as does the lanczos filter that I also provided code and graphs for in my article.
Perhaps Richard could convert the lanczos into R code too 😉
Here is an example of the lanczos filter used on daily satellite TLS data, to remove annual signal.
http://climategrog.wordpress.com/?attachment_id=750
Comparing Lanczos to two (very similar) variations of the triple running mean : http://climategrog.wordpress.com/?attachment_id=659
Willis: “As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. ”
There is a long term change in there too Willis. Cooling at the end of the 19th century, warming since the beginning of the 20th. When you add a slope to a pure cosine you will find that it shifts the peaks and troughs, and when you have differing slopes behind it they will get moved back and forth. That is the cause of the irregularity of the intervals you have noted.
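Greg's point about slopes moving the extrema is easy to demonstrate numerically (a synthetic sketch, not fitted to any temperature data): add a linear trend to a pure 60 year cosine and the peak no longer falls where the cosine alone peaks.

```python
import numpy as np

t = np.linspace(-30, 30, 60001)     # years around one peak, 0.001 yr steps
w = 2 * np.pi / 60                  # pure 60 year 'cycle'
pure = np.cos(w * t)
trended = np.cos(w * t) + 0.02 * t  # same cosine plus a modest linear trend

peak_pure = t[np.argmax(pure)]       # at t = 0 by construction
peak_trended = t[np.argmax(trended)] # shifted later, to where sin(w t) = slope / w
print(peak_pure, round(peak_trended, 1))
```

With differing background slopes on either side of a peak, successive peaks and troughs shift by different amounts, which is exactly the irregular spacing Willis measured.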
Once again, you display your lack of understanding and over-confidence in your own abilities, then start telling others the way it is. You, like our host, seem to have adopted “the science is settled” attitude to cycles. Just remember that science is never settled and keep an open mind.
It is true that there is no guarantee that this will continue, but so far it seems to fit quite well, with the “pause” now becoming a downturn since 2005.