Guest essay by M. S. Hodgart
(Visiting Reader, Surrey Space Centre, University of Surrey)
The figure presented here is a new graph of the story of global warming – and cooling. The graph makes no predictions and should be used only to see what has been happening historically.
The boxed points in the figure are the ‘raw data’ – the annualised global average surface temperature known as HadCRUT4 as released by the UK Meteorological Office. Strictly these are ‘temperature anomalies’. The plot runs from 1870 up to the last complete calendar year 2012. The raw data cannot of course be treated as absolutely true – but let us give the Met Office the benefit of the doubt – this is hopefully their best effort so far.
It is a difficult statistical problem to estimate the historical trend in this kind of time series. The solution requires some kind of smoothing of the data – but how, exactly? There are an unlimited number of ways of drawing some curve through the data.
A popular method – much used in the climate science literature – is the moving average. One trouble with it is that quite different-looking curves are obtained depending on the width of the smoothing window used in that average – and also on the choice of window. Another difficulty is its poor dynamic tracking capability.
The other popular method is to fit straight lines (least-squares estimates) to selected spans of years. The notorious difficulty here is the quite different impression one gets depending on the choice of start and stop years.
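As a purely illustrative sketch (not the author’s code, and using a synthetic stand-in series rather than HadCRUT4), the snippet below shows how both of these popular smoothers depend on arbitrary choices – the window width of a moving average, and the start and stop years of a straight-line fit:

```python
# Illustrative only: synthetic series standing in for an anomaly record
# (slow rise, ~65-year wiggle, year-to-year noise).
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1870, 2013)
series = (0.005 * (years - 1870)
          + 0.1 * np.sin(2 * np.pi * (years - 1870) / 65)
          + rng.normal(0.0, 0.1, years.size))

def moving_average(y, width):
    # Centred moving average for an odd window width; end years, where the
    # window runs off the record, are left as NaN.
    half = width // 2
    out = np.full(y.size, np.nan)
    out[half:y.size - half] = np.convolve(y, np.ones(width) / width, mode="valid")
    return out

for width in (5, 11, 21):                      # same data, three window widths
    residual = np.nanstd(series - moving_average(series, width))
    print(f"window {width:2d} yr: residual spread {residual:.3f} deg")

for start, stop in [(1910, 1945), (1975, 2000), (1998, 2012)]:
    mask = (years >= start) & (years <= stop)
    slope = np.polyfit(years[mask], series[mask], 1)[0]   # deg per year
    print(f"{start}-{stop}: {10 * slope:+.3f} deg/decade")
```

The window widths and fit spans above are arbitrary choices, which is exactly the point: each gives a different-looking answer on the same data.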
The difficulty is finding a best estimate – some curve which is most likely to be closest to the truth. This is an outstanding problem in what the statistical literature identifies as model selection.
The source of the problem is what the telecommunication and control engineers call noise in the data – a random-looking variation from one year to the next.
As a conspicuous example of this random variation: in recent years, according to the record, the global temperature (anomaly) was 0.18 deg in 1996; it had jumped to 0.52 deg in 1998 but had fallen again to 0.29 deg by 2000.
Respecting normal linguistic usage and common sense, we would not want to describe a jump of 0.34 deg in only two years as a phenomenon of ‘global warming’; nor a drop of 0.23 deg over the next two years as ‘global cooling’. Ordinary language, when expressed in mathematics, envisages some smooth, slowly varying curve which passes on a middle course through the scattered data, ignoring these rapid changes but responsive over a longer term to general movement. There needs to be an explicit decomposition:
HadCRUT4 annual data = trend in the data + temperature noise
The problem is to estimate that trend in the data when it is corrupted by the presence of this significant noise.
HadCRUT4 global annual averaged temperature anomaly, 1870–2012 (connected brown box points). Brown curve: 26-year span cubic loess estimate. Dashed brown curve: 10th-degree polynomial regression (PR) estimate. Red curve: the mean trend. Blue curve: the offset cyclic component of the loess. Red circled points identify coincident years of trend and mean trend: 1870, 1891, 1927, 1959, 1992 and 2012. Blue circled points delineate alternating cooling and warming in the cyclic variation: 1877, 1911, 1943, 1976 and 2005.
A novel principle of joint estimation is proposed here – using two relatively simple methods of smoothing.
In the figure the continuous brown curve is an estimate by locally weighted regression (loess) – using a locally fitted cubic polynomial and the standard ‘tri-cube’ weighting. Loess is a greatly superior generalisation of the moving average [1]. Professor Mills deserves credit for first pointing out the superiority of a cubic over the usual linear or quadratic local polynomial [2]. Unfortunately the standard statistical tools seem not to have caught up with him here – nor with his ‘natural’ solution to the end-point problem (where the data run out after 2012 and before 1870 on this graph).
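The following is a minimal sketch of the kind of local cubic, tri-cube-weighted loess described above. It is not the author’s implementation: the span is simply passed in as a parameter, there are no robustness iterations, and no special end-point treatment of the kind Mills proposes is attempted.

```python
import numpy as np

def loess_cubic(x, y, span_years):
    # Local cubic fit with tri-cube weights over a window of +/- span_years/2
    # around each point; no robustness iterations, no special end treatment.
    fitted = np.empty(len(y), dtype=float)
    half = span_years / 2.0
    for i, x0 in enumerate(x):
        d = np.abs(x - x0)
        in_win = d <= half
        w = (1.0 - (d[in_win] / half) ** 3) ** 3       # tri-cube weights
        t = x[in_win] - x0                              # local years centred on x0
        X = np.vander(t, 4, increasing=True)            # columns 1, t, t^2, t^3
        sw = np.sqrt(w)                                 # weighted least squares
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y[in_win], rcond=None)
        fitted[i] = beta[0]                             # local cubic evaluated at x0
    return fitted

# Illustrative usage on a synthetic stand-in series (HadCRUT4 itself is not bundled here):
years = np.arange(1870, 2013, dtype=float)
anom = 0.005 * (years - 1870) + 0.1 * np.sin(2 * np.pi * (years - 1870) / 65)
smooth = loess_cubic(years, anom, span_years=26)
```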
The dashed brown curve is a standard (unweighted) polynomial regression. The principle of joint estimation is to look for a span of years in the loess and a degree in the polynomial regression at which the two curves most closely resemble each other – where there is least disparity.
Empirical search finds a span of 26 years for the loess and a 10th degree for the polynomial. No other combination of loess span and polynomial degree gives such close agreement. The condition is unique and therefore automatically solves the problem of model selection.
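A minimal sketch of this joint-estimation search, as one might implement it (the RMS-difference disparity measure and the search ranges shown here are assumptions for illustration, not the author’s published procedure):

```python
import numpy as np

def disparity(x, y, span, degree, loess_fn):
    # RMS difference between the loess curve (given span) and an unweighted
    # polynomial regression of the given degree, fitted on centred/scaled
    # years for numerical stability.
    loess_curve = loess_fn(x, y, span)
    t = (x - x.mean()) / (x.max() - x.min())
    poly_curve = np.polyval(np.polyfit(t, y, degree), t)
    return float(np.sqrt(np.mean((loess_curve - poly_curve) ** 2)))

def joint_estimate(x, y, spans, degrees, loess_fn):
    # Grid search: the (span, degree) pair whose two curves agree most closely.
    grid = {(s, d): disparity(x, y, s, d, loess_fn) for s in spans for d in degrees}
    best = min(grid, key=grid.get)
    return best, grid

# e.g., using the loess_cubic sketch above:
# (span, degree), grid = joint_estimate(years, anom, spans=range(16, 41, 2),
#                                       degrees=range(4, 15), loess_fn=loess_cubic)
```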
In the author’s view this joint estimate is really the best that can be done in finding the trend of global surface temperature. For various reasons the loess estimate should be prioritised.
The optimal estimate identifies alternating cooling and warming intervals from 1877 to 2005. Two cooling intervals alternated with two warming intervals. These two cycles of alternating cooling and warming were barely conceded, and certainly not discussed let alone explained, in the influential IPCC 4th report (AR4) published in 2007 and based on data available to 2005.
But this property conflicts with a different requirement: that a trend should be a “smooth broad movement non-oscillatory in nature” (see 1.22 in Kendall and Ord’s classic text [3]). To reconcile these different requirements the estimated trend must be further decomposed into a non-oscillatory mean trend (red curve) and a quasi-periodic oscillation (blue curve):
trend in the data = mean trend in the data + quasi-periodic oscillation
A unique decomposition is achieved by computer-assisted iterative adjustment of four intersecting common years (red circled points). The mean trend is a cubic spline interpolation which deviates least from a straight line while the oscillatory component has a zero average over the record.
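A simplified sketch of the mechanics of this second decomposition follows. It is not the author’s procedure: the knot years are taken as given rather than found by computer-assisted iterative adjustment, and the condition that the mean trend deviates least from a straight line is not imposed – the sketch only shows how an estimated trend splits into a smooth spline through coincident years plus a residual oscillation.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def decompose_trend(years, trend, knot_years):
    # Mean trend: a natural cubic spline through the trend values at the chosen
    # coincident years; oscillation: whatever of the trend is left over.
    knots = np.asarray(knot_years, dtype=float)
    knot_values = np.interp(knots, years, trend)
    mean_trend = CubicSpline(knots, knot_values, bc_type="natural")(years)
    oscillation = trend - mean_trend
    return mean_trend, oscillation

# e.g., with the coincident years quoted in the figure caption:
# mean, cyc = decompose_trend(years, smooth, [1870, 1891, 1927, 1959, 1992, 2012])
# print(round(cyc.mean(), 3))   # should be close to zero if the knots are well placed
```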
The strong oscillating component – the blue curve – is seen to be contributing more than half of the rate of increase when global warming was at a peak in the early 1990s.
What goes up may come down. This oscillating component looks to be continuing. Assessment is increasingly uncertain the closer one gets to the last data year of 2012. But despite this difficulty it can be stated with high confidence (IPCC terminology – better than 80%) that there has again been global cooling in recent years.
In the author’s view the whole climate debate has been muddled – and continues to be muddled – by not differentiating between this trend in the data (which oscillates) and the mean trend (which does not).
So yes – global warming looks to have stopped (if you believe in HadCRUT4) when one defines global surface temperature in terms of that trend – the brown curve. In fact it has more than stopped – it looks very much to have gone into reverse.
But no – average global warming continues ever upwards (still believing in HadCRUT4) when one defines an average global surface temperature in terms of that mean trend – the red curve.
An unambiguous computation of the rate of temperature increase is achieved by working from those common years (red circled points) when the two estimates coincide. The increase for HadCRUT4 from 1870 to 2012 of 0.75 ± 0.24 deg is equivalent to an average rate of 0.053 ± 0.017 deg/decade. From 1959 to 2012 this average rate looks to have increased to 0.090 ± 0.034 deg/decade. (The error limits here are the usual ±2 standard deviations, or 95% confidence limits.)
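For the record, the quoted central rates follow from simple arithmetic over the stated intervals (this is only a check of the central values, not of the author’s error analysis):

```python
# Arithmetic check of the quoted central rates.
rise_1870_2012 = 0.75                         # deg, over the whole record
print(rise_1870_2012 / ((2012 - 1870) / 10))  # -> ~0.053 deg/decade

rate_1959_2012 = 0.090                        # deg/decade, quoted for the later interval
print(rate_1959_2012 * (2012 - 1959) / 10)    # implied rise since 1959, ~0.48 deg
```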
If this faster trend were to continue then we would be looking at an average rise from now of 0.8 deg by the end of this century (not choosing to set controversial error limits into the future). It is not however safe to make any predictions on the basis of the plot and the methodology adopted here.
It should not need to be stressed that there is no contradiction between these results and finding that regional warming may be continuing – particularly in high Northern latitudes and the Arctic.
There is a great deal more that can and needs to be said to justify these results. Interested readers can apply to the author for longer treatments, and in particular a full and detailed mathematical justification.
[1] W. S. Cleveland and S. J. Devlin, “Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting”, Journal of the American Statistical Association, Vol. 83, No. 403 (Sep. 1988), pp. 596–610.
[2] T. C. Mills, “Modelling Current Trends in Northern Hemisphere Temperatures”, International Journal of Climatology, 26, pp. 867–884 (2006).
[3] Kendall and Ord, Time Series, 3rd edition, Edward Arnold (1990).
richardbriscoe says:
September 24, 2013 at 12:39 am
There’s a third component besides natural & human, i.e. a second human input: the fudging of data by “climate scientists”. Taking out that effect & the natural leaves very little room for net general human activity, which produces both cooling & warming. The net effect of human GHGs is vanishingly small, well within the margin of error & no cause for concern. To the negligible extent that there might be man-made warming, it’s a good thing, as of course too is more CO2 in the air.
‘– but let us give the Met Office the benefit of the doubt –’
Fatal mistake.
We have no objective reason to give the Met Office, or the CRU with which it collaborated in producing HadCRUT4, the benefit of any doubts. On the contrary we have every reason to be suspicious of these government-sponsored agencies that are openly committed to CAGW-advocacy, especially since the Climategate revelations.
Trusting in the unverifiable word of these official data-manufacturers is like trusting the tailors who made the King’s New Clothes. It is a fundamentally unscientific act that constitutes our first basic step into the make-believe reality which they have concocted and which they totally control. From there on in it is just an endless, labyrinthine trip through their pre-calculated fantasy-world like Donald Duck’s wanderings in Mathmagicland.
This analysis is along the right lines but ends up showing too much warming because of the data choices. A better source of data would be HadSST3, which goes back to 1850. It shows the warming is closer to 0.6 C over 160+ years and matches what has been seen in the satellite data more closely. With biased adjustments, siting problems and UHI there is no good global data set that includes land data.
If done in this manner the future warming looks more like 0.05 C/decade, or 0.4 C by the year 2100.
@James Baldwin Schrumpf 4:00 am
I don’t see how the natural variability of the temperature data can be called “noise.” ….
I think referring to the natural variability of data that has already been strenuously massaged to get to that one temperature point for the entire Earth as “noise” is a false concept. Rather than trying to get a smooth curve from it, we should step back and say, “Damn, that number moves around a LOT, doesn’t it?”
I agree with you in spirit, James. What is noise to some is signal to others. I’ve been scientifically amused by seismic interpretation over the past 35 years. What was noise and out-of-plane artifact is now stratigraphic information. “The rocks really do look like that!”
But noise is real. And so is error masquerading as noise.
To go from the raw temperatures to HadCRUT4 the data pass through a lot of hands, and each one adds error, seen as noise. To get to the anomaly, the “keepers” (1) of the data do not subtract, but ADD corrections for what we THINK are KNOWN phenomena to better analyze an unknown residual. Adding corrections ADDS ERROR (noise) equal to the difference of (what we THINK – what it really is). So what we are all studying is the
Anomaly = Raw data + corrections for what is known + error in corrections + bias in corrections,
where Raw data = “local signal” + “local contamination bias and error” + recording error.
This is all bad enough if the bias = 0. But the bias isn’t zero. Too many hands are on the data and in the till at the same time.
Note 1: “keepers” of the data – a term used with deliberate sarcastic irony.
See: An Open Letter to Dr. Phil Jones of the UEA CRU , Willis Eschenbach, WUWT 11/27/2011.
(another nomination to a Watts’ Best collection as well as “Climate Fail Files”)
The second derivative of the mean trend appears to go negative after 1960, which is exactly the opposite of what would be expected from more CO2 per CAGW theory.
While I suppose it’s fun to break down temperature data into a sum of a variety of different curves – the curves have to at least correlate to ~something~ in reality to be of any value. For example, the oscillating component has a period of 65 years so what feature in our climate has a 65 year period?
If there isn’t one then this is nothing more than an exercise in advanced numerology. Just as you said: “There are an unlimited number of ways of drawing some curve through the data.” – there are an unlimited number of CURVES that will add up to that data.
“Empirical search finds a span of 26 years for the loess and a 10th degree for the polynomial. No other combination of loess span and polynomial degree gives such close agreement. The condition is unique and therefore automatically solves the problem of model selection.”
“In the author’s view this joint estimate is really the best that can be done in finding the trend of global surface temperature. For various reasons the loess estimate should be prioritised.”
I love curve-fitting, and of course I respect loess smoothing. This goes into the collection of models that will be stringently tested by the upcoming decades. As I wrote for Vaughan Pratt: if you know the main trend (he had a particular model for the trend), you can estimate the noise; if you know the noise, you can estimate the main trend. If both have to be estimated from the same time series data (with or without “optimal” smoothing), then you have “curve fitting”.
Can the components of the model, trend and residual oscillation, be related (linearly or polynomially) to measurements of known physical processes?
You keep saying “noise” when what you mean is DATA. Raw temperatures are not noise. Noise would be the non-temperature input to the data: human error, equipment inaccuracy and imprecision, changes in measurement technique. All of which clearly exist. However, treating the actual data as noise to be vanished away by statistical magic in favor of a mythical “global temperature anomaly” destroys any trace of credibility you may have had to start with. Averages are a unique mathematical contrivance that actually contains less information the more information you cram into them. Furthermore, creating a graph that has a 12 hundredths of a degree scale compared to a 142 year scale in order to produce an alarmingly sharp red upward trend instead of the very slow gentle rise actually indicated is hardly impartial. Report real temperatures with 100 degrees on the vertical for 100 years on the horizontal and your basis for alarm disappears.
“Surrey Space Centre University of Surrey”
The Surrey Space Centre. In Guildford.
Pure comedy gold.
With respect to curve fitting, I get an R^2 of 0.8785 regressing these data on a 250-year cycle with statistically significant harmonics at 125 years and 62.5 years. Interestingly, projecting into the future, we are in a peak now and can expect a gradual decline until around 2037, followed by a gradual rise for around 30 years, then a more rapid decline. This is not a CAGW prediction.
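A minimal sketch of the kind of harmonic regression described in this comment – an assumed reconstruction, not the commenter’s actual code – regressing the series on sine/cosine pairs at a 250-year fundamental plus its 125- and 62.5-year harmonics, then reporting R^2:

```python
import numpy as np

def harmonic_fit(years, y, periods):
    # Design matrix: intercept plus sin/cos columns for each period.
    cols = [np.ones_like(years, dtype=float)]
    for p in periods:
        cols.append(np.sin(2 * np.pi * years / p))
        cols.append(np.cos(2 * np.pi * years / p))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return fitted, 1.0 - ss_res / ss_tot        # fitted curve and R^2

# e.g. fitted, r2 = harmonic_fit(years, anom, periods=[250.0, 125.0, 62.5])
```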
“But no – average global warming continues ever upwards (still believing in HadCRUT4) when one defines an average global surface temperature in terms of that mean trend – the red curve.”
The “average” is an artifact of the statistics. One cannot assume that there is a meaningful “average” temperature for Earth. If the problem were formulated in terms of joules, units of energy, then an average would make sense. There are nicely formulated laws for conservation of energy in all branches of physics. There are no laws for conservation of temperature in any branch of climate science. For example, we might all agree that ENSO can redistribute temperatures across the Pacific but that the short term effects of these redistributions must sum to zero over the long run because conservation of energy requires it. Yet all the warming from 1979 to 1998 could have been the result of redistributions of temperature from myriad local phenomena such as ENSO. To imagine that temperatures can be precisely modeled as energy can is to impose on nature a uniformity that might not exist.
It’s HADCRUT…the entire trend is manufactured by “adjustments”
Does anyone know how accurate the thermometers used for the CRU data are? Are the scale and readout sufficient to derive an accuracy to the tenth or hundredth of a degree?
I’d have preferred Hadcrut 3. ” In fact it has more than stopped – it looks very much to have gone into reverse.” A projection to 2030 or so would have been interesting, particularly given the imminent release of AR5.
So why do the GCMs all miss this ‘oscillation’?
fhHaynie – No doubt you approve of the cooling forecast at 6:33 above and at
http://climatesense-norpag.blogspot.com
However, I use the 60- and 1000-year solar cycles.
Dr. Page,
I didn’t select the primary wavelength based on an expected physical forcing. The trial-and-error regression on this particular set of data to maximize R^2 gives me those results. You would need a much longer set of data to identify a 1000-year cycle. These observed cycles are most likely riding on progressively longer cycles, and projecting outside 1.5 times the observed range is risky business. However, I agree with you that these cycles are likely to be associated with solar cycles.
Anyone familiar with Thom, Zeeman, Ilya Prigogine will know of “Butterfly,” “Cusp,” “Fold,” and “Swallowtail” catastrophes – abrupt transitions, “breaking points” wholly unanticipated by any measure of central tendency (average, mean, median, mode et al.). “Catastrophes” are not mere math/statistical artifacts, but fundamental natural processes. Citing “persistence fallacies” to enshrine long-term norms as permanent is dangerously misleading.
Curve fitting is only curve fitting, however good it might seem to be – “with four parameters I can fit an elephant, with five I can make his trunk wiggle”.
Science is making falsifiable hypotheses: could you please indicate what the hypothesis is here, and in what circumstances you consider it will be nullified?
TimC – I’m not sure what the hypothesis is here. The hypothesis in my cooling forecast linked above is simple and clear, i.e. that the current cooling peak at about 2003 is a peak in both the 60-year and 1000-year quasi-cycles.
It will be seriously in question if there is not about 0.15–0.2 degrees of cooling by 2018–20.
I want to see the coefficients of that 10th degree polynomial. I suspect that they will either decrease rapidly in magnitude, or lead to a lot of nearly equal terms cancelling each other out. In either case, there is no need (and, as Steven Mosher points out, no physical motivation) to go to such a high degree.
However, I must admit it’s fun to see an analysis that includes the statement that:
“… From 1959 to 2012 this average rate looks to have increased to 0.090 ± 0.034 deg/decade. ”
Hmmm. Less than half of the 0.2 deg/decade that the IPCC is claiming.
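A minimal sketch of the coefficient inspection asked for above (illustrative only – fitted here to a synthetic stand-in series, not to the essay’s actual trend):

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1870, 2013, dtype=float)
anom = (0.005 * (years - 1870) + 0.1 * np.sin(2 * np.pi * (years - 1870) / 65)
        + rng.normal(0.0, 0.1, years.size))

# Polynomial.fit works on years rescaled to [-1, 1], so the coefficients printed
# here are those of the rescaled polynomial and stay numerically well behaved;
# coefficients in raw calendar years would be enormous and near-cancelling.
poly = np.polynomial.Polynomial.fit(years, anom, deg=10)
for k, c in enumerate(poly.coef):
    print(f"c[{k}] = {c:+.4e}")
```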
The sharpness of the downtrend is what is of most interest to me. I have kept similar tabs on HadCRUT3v, but using an 11-year binomial smoothing, and the same sharp downtrend is present over the past decade (the peak being 2003 to 2005 depending on whether you look at the global data, the SH or the NH). I also run the three data sets (GL, NH and SH) together, plus the NH–SH difference, as I think that gives a clearer overall picture. What is at issue here is getting the clearest possible picture of what is happening. I cannot think of a dumber way to illustrate same than using a linear regression, but why would I be surprised when the CAGW proponents include such narcissistic nongs as Pachauri, Hansen, Nuccitelli, Mann, Cook and Lewandowski. It is not science we are looking at but agitprop and marketing when nitwits like these are involved.
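For reference, a minimal sketch of an 11-point binomial smoother of the kind mentioned in this comment (an assumed reconstruction, not the commenter’s own code):

```python
import numpy as np
from math import comb

def binomial_smooth(y, points=11):
    # Smoothing kernel: a row of Pascal's triangle normalised to sum to one;
    # end years where the window runs off the record are left as NaN.
    kernel = np.array([comb(points - 1, k) for k in range(points)], dtype=float)
    kernel /= kernel.sum()
    half = points // 2
    out = np.full(len(y), np.nan)
    out[half:len(y) - half] = np.convolve(y, kernel, mode="valid")
    return out
```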
As others have pointed out, this analysis is useless. Curve-fitting is meaningless without any real-world mechanisms. The fact that there are a couple of visible cycles does not allow any forward projection, because the mechanisms behind the cycles are not known.
Consider this : Suppose that the analysis was of sunspots instead of global temperature, and that only a couple of cycles were available. How meaningful would the analysis be? The answer is – absolutely useless. Let’s face it, in reality we’ve had how many cycles already – 23? – and how well can we forecast the amplitude and duration of just the next cycle, let alone the next several cycles? We simply can’t do it, our skill level is absolute zero. The wildly wrong forecasts of cycle 24 prove that. Before cycle 24 started, some ‘experts’ expected it to be a big one. It turned out to be the smallest for yonks, much smaller than all forecasts.
So this temperature analysis is absolutely meaningless and a waste of time in all respects bar one: it does suggest that there are cyclical influences and that alone is sufficient to cast serious doubt on the totally cycle-free vision of the IPCC.
Good post. The author writes: “The raw data cannot of course be treated as absolutely true – but let us give the Met Office the benefit of the doubt – this is hopefully their best effort so far.”
The data has been fraudulently corrupted. Early temperatures have been depressed in order to create a spurious warming trend. Handwritten records from 1902 show 0.7C. Today’s electronic record shows that reading as -0.2C. You might as well interpret Tony Soprano’s tax return.
http://endisnighnot.blogspot.co.uk/2013/08/the-past-is-getting-colder.html
At the beginning of the post it said that this would be based on the ‘raw data’. I was hopeful that the author had somehow backed out a significant portion of the historical adjustments, but then I see it’s HadCRUT4, which was rather disappointing. I’ll only make passing mention of the fact that this is not ‘data’ at all, but a statistical construct.
Leaving that aside, and also leaving aside the issues with fitting a 10th order polynomial to such ‘data’ (lots of degrees of freedom…) what is becoming apparent to me is that there is a cyclical trend that can be linked to physical processes such as the PDO/AMO, as well as a long-term linear trend.
Given this, the way we can identify an anthropogenic signature in the temperature data is by attempting to identify the cyclical component as best we can, and then comparing the long-term linear slope to the near-term linear slope on datasets with long enough temperature histories. If the alarmists are correct, then there should be a clear difference in linear trends over time and in fact we should be seeing an accelerating linear trend in the nearest-term data.
If the trend is less alarmist and more lukewarm then obviously this acceleration in trend would not appear, and we do not have a justification for spending trillions on Al Gore’s pet projects.
CET is an obvious candidate; are there other datasets which are of a long enough duration for such an analysis to be undertaken? Using single-site data may also help in identifying the cyclical component; if we find similar periodicity in data we separately analyse from different sites we can be more confident that the periodicity is genuine.
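A minimal sketch of the slope comparison proposed in this comment (illustrative only; the function names and the 30-year near-term window are assumptions, and the series could be CET or any other long annual record loaded separately):

```python
import numpy as np

def decade_slope(years, series, start, stop):
    # Ordinary least-squares slope over the chosen window, in deg per decade.
    mask = (years >= start) & (years <= stop)
    return 10.0 * np.polyfit(years[mask], series[mask], 1)[0]

def acceleration_check(years, series, near_term=30):
    # Compare the whole-record slope with the slope over the most recent window;
    # a markedly larger near-term slope would point towards acceleration.
    long_slope = decade_slope(years, series, years.min(), years.max())
    near_slope = decade_slope(years, series, years.max() - near_term, years.max())
    return long_slope, near_slope
```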