A quick look at temperature anomaly distributions

R code to look at changing temp distributions – a follow-up to Hansen/Sato/Ruedy

Story submitted by commenter Nullius in Verba

There has been a lot of commentary recently on the new Hansen op-ed, with its associated paper. Like many, I was unimpressed by the lack of context and the selective presentation of statistics. To some extent it was just a restatement of what we all already knew – that according to the record the temperature has risen over the past 50 years – but with extreme value statistics picked out to give a more alarming impression.

But beyond that, it did remind me of an interesting question I’ve considered before but never previously followed up, which was to ask how the distribution had actually changed over time. We know the mean has gone up, but what about the spread? The upper and lower bounds?

So I plotted it out. And having done so, I thought it might be good to share the means to do so, in case anyone else felt motivated to take it further.

I’m not going to try to draw any grand conclusions, or comment further on Hansen. (There will be a few mild observations later.) This isn’t any sort of grand refutation. Other people will do that anyway. For the purposes of this discussion, the data is what it is. I don’t propose to take any of this too seriously.

I’ll also say that I make no guarantees that I’ve done this exactly right. I did it quickly, just for fun, and my code is certainly not as efficient or elegant as it could be. If anyone wants to offer improvements or corrections, feel free.

I picked the HadCRUT3 dataset to look at partly because it doesn’t use the extrapolation that GISSTEMP does, only showing temperatures in a gridcell if there are actual thermometers there. But it was also because I’d looked at it before and already knew how to read it!

For various reasons it’s still less than ideal. It still averages things up over a month and a 5×5 degree lat/long gridcell. That loses a lot of detail and narrows the variances. From the point of view of studying heatwaves, you can’t tell if it was 3 C warmer for a month or 12 C warmer for a week. So it clearly doesn’t answer the question, but we’ll have a look anyway.
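The narrowing effect of averaging is easy to see with a toy simulation (synthetic white-noise "days", nothing to do with the real data):

```r
# Averaging narrows the spread: compare the spread of simulated daily
# anomalies with the spread of their monthly means.
set.seed(42)
ndays <- 30                              # "days" per month
daily <- rnorm(ndays * 12 * 50, sd = 3)  # 50 years of daily anomalies, sd = 3 C
monthly <- colMeans(matrix(daily, nrow = ndays))
sd(daily)    # close to 3
sd(monthly)  # close to 3/sqrt(30), i.e. about 0.55
# Real daily temperatures are autocorrelated, so the real narrowing is
# less extreme than 1/sqrt(30), but the direction is the same.
```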

Then I picked the interval from 1900-1950 to define a baseline distribution. I picked this particular interval as a compromise – because the quality of the earliest data is very questionable, there being very few thermometers in most parts of the world, and because the mainstream claims often refer only to the post-1950 period as being attributable to man.

Of course, you have the code, so if you don’t like that choice you can pick a different period.

The first plot shows just the distribution for each month. Time is along the x-axis, temperature anomaly up the y-axis, and darker shading is more probable.

(From ‘HadCRUT3 T-anom dist 5 small.png’)

Because the outliers fade into invisibility, here is a plot of log-probability that emphasizes them more clearly.

(From ‘HadCRUT3 T-anom log-dist 20 small.png’)

You can see particularly from the second one that the incidence of extreme outliers hasn’t changed much. (But see later.)

You can also see that the spread is a lot bigger than the change. The rise in temperatures is perceptible, but still smaller than the background variation. It does look like the upper bound has shifted upwards – about 0.5 C by eye.

To look more precisely at the change, what I did next was to divide the distribution for each month by the baseline distribution for 1900-1950.

This says whether the probability has gone up or down. I then took logarithms, to convert the ratio to a linear scale: if doubling is so many units up, then halving is the same number of units down. And then I colored the distribution with red/yellow for an increase in probability and blue/cyan for a decrease in probability.
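In outline, that ratio-and-log step is just this (a minimal sketch with made-up five-bin distributions, not the real 402-bin ones):

```r
# Toy version of the change plot's arithmetic: probability ratio, then log.
# On the log scale, doubling (+log 2) and halving (-log 2) are symmetric.
baseline <- c(0.10, 0.20, 0.40, 0.20, 0.10)   # made-up baseline frequencies
current  <- c(0.05, 0.10, 0.40, 0.30, 0.15)   # made-up later frequencies
logratio <- log(current / baseline)
round(logratio, 2)  # -0.69 -0.69  0.00  0.41  0.41
```

Negative values (blue/cyan in the plots) mean the bin has become less probable than in the baseline; positive values (red/yellow) mean more probable.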

(From ‘HadCRUT3 T-anom dist log-change 10 small.png’)

You can see now why it was convenient for Hansen to have picked 1950-1980 to compare against!

Blue surrounded by red means a broader distribution, as in the 1850-1870 period (small gridcell sample sizes give larger variances). Red surrounded by blue means a narrower distribution, as in the 1950-1990 period. Blue over red means cooler climate, as in the 1900-1920 period. Red over blue means a warmer climate, as in the post-1998 period.

The period 1900-1950 (baseline) shows a simple shift. The period 1950-1990 appears to be a narrowing plus a shift. The post-1990 period shows the effects of the shift meeting the effects of the narrowing. The reduction in cold weather since 1930 is unambiguous, but the increase in warm weather is only clear post-1998; until then, there was a decrease in warmer weather due to the narrowing of the distribution. There is a step change in 1998, with little change thereafter.

Generally, the distribution has got narrower over the 20th century, but jumped to be wider again around the end of the century. (This may, as Tamino suggested, be because different parts of the world warm at different rates.) The assumption that the variance is naturally constant and any change in it must be anthropogenic is no more valid than the same assumption about the mean.
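That mechanism is easy to sketch: if two regions drift apart in mean, the pooled spread widens even though neither region's own spread changes (toy numbers, my own, not fitted to the data):

```r
# Pooled spread of two regions with the same internal sd but different
# mean warming. For an equal-weight mixture, variance = within + between:
# sd_pooled = sqrt(s^2 + ((m1 - m2)/2)^2).
set.seed(7)
a <- rnorm(1e5, mean = 0.2, sd = 1)   # region warming by 0.2 C
b <- rnorm(1e5, mean = 1.0, sd = 1)   # region warming by 1.0 C
pooled <- sd(c(a, b))
pooled   # about sqrt(1 + 0.16) = 1.08, wider than either region's 1.0
```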

So far, there’s nothing visible to justify Hansen’s claims. The step-change after 1998 is a little bigger and longer than the 1940s, but not enough to be making hyperbolic claims of orders-of-magnitude increases in probability. So what have I missed?

The answer, it turns out, is that Hansen’s main results are just for summer temperatures. The spread of temperatures over land is vastly larger in the winter than in the summer.

Looking at the summer temperatures only, the background spread is much narrower and the shifted distribution now pokes its head out above the noise.

I’ve shown the global figures for July below. I really ought to do a split-mix of July in the northern hemisphere and January in the southern, but given the preponderance of land (and thermometers!) in the NH, just plotting July shows the effect nicely.

(From ‘HadCRUT3 T-anom dist log-change 10 July.png’)

Again, the general pattern of a warm 1940s and a narrowing of the distribution from 1950-1980 shows up, but the post-1998 step-change is now +2 C above the background. It also ramps up earlier, with more very large excursions (beyond 5 C) showing up around 1970, and a shift in the core distribution around 1988. The transitions look quite sharp. I think this is what Hansen is talking about.

Having had a quick look at some maps of where the hot spots are (run the code), it would appear a lot of these ‘heatwaves’ are in Siberia or northern Canada. I can’t see the locals being too upset about that… That also fits with the observation that a lot of the GISSTEMP global warming is due to the extrapolation over the Arctic. That would benefit from some further investigation.

None of this gives us anything solid about heatwaves, or tells us anything about cause, of course. For all we know, the same thing might have happened in the MWP.

Run the code! It generates a lot more graphics, at full size.

# ##################################################
# R Script
# Comparison of HadCRUT3 T-anom distributions to 1900-1950 average
# NiV 11/8/2012
# ##################################################

library(ncdf)
library(maps)

# Local file location - ***CHANGE THIS AS APPROPRIATE***
setwd("C:/Data/Climate")

# Download file if no local copy exists
if(file.exists("hadcrut3.nc") == FALSE) {
  download.file("http://www.metoffice.gov.uk/hadobs/hadcrut3/data/HadCRUT3.nc","hadcrut3.nc")
}

# HadCRUT3 monthly mean anomaly dataset. For each 5x5 lat/long gridcell
# reports the average temperature anomaly of all the stations for a
# given month. Runs from 1850 to date, indexed by month:
# month 1 is Jan 1850, month 25 is Jan 1852, etc.
# Four dimensions of array are longitude, latitude, unknown, and month.
hadcrut.nc = open.ncdf("hadcrut3.nc")

# --------------------
# Functions to extract distributions from data

# Names of months
month = c("January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December")

month_to_date = function(m) {
  mn = ((m-1) %% 12) + 1    # month 1 = Jan 1850
  yr = 1850 + ((m-1) %/% 12)
  return(paste(month[mn]," ",yr,sep=""))
}

# Function to show 1 month's data
plotmonthmap = function(m) {
  d = get.var.ncdf(hadcrut.nc,"temp",start=c(1,1,1,m),count=c(72,36,1,1))
  clrs = rev(rainbow(402))   # one fewer colour than breaks, as image() requires
  brks = c(-100,-200:200/20,100)
  image(c(0:71)*5-180,c(0:35)*5-90,d,col=clrs,breaks=brks,useRaster=TRUE,
        xlab="Longitude",ylab="Latitude",
        main=paste("Temperature anomalies",month_to_date(m)))
  map("world",add=TRUE)
}

# Function to extract one month's data, as a vector of length 36*72=2592
getmonth = function(m) {
  res=get.var.ncdf(hadcrut.nc,"temp",start=c(1,1,1,m),count=c(72,36,1,1))
  dim(res) = NULL # Flatten array into vector by deleting dimensions
  return(res)
}

# Given a vector of month indexes, extract data for all those months as a single vector
getmultimonths = function(mvec) {
  res=vapply(mvec,getmonth,FUN.VALUE=c(1:2592)*1.0);dim(res)=NULL
  return(res)
}

# Function to determine the T-anom distribution for a vector of month indexes
# Result is a vector of length 402 representing frequency of 0.1 C bins
# ranging from -20 C to +20 C. Element 200 contains the zero-anomaly.
# Data is smoothed slightly to mitigate poor sample size out in the tails.
gettadist = function(mvec) {
  d = getmultimonths(mvec)
  res = table(cut(d,c(-Inf,seq(-20,20,0.1),Inf)))[]
  res = res/sum(res,na.rm=TRUE)
  res = filter(res,c(1,4,6,4,1)/16)
  return(res)
}

# --------------------

# Draw Summer maps 1998-2012
for(m in c(seq(1781,1949,12),seq(1782,1949,12),seq(1783,1949,12))) {
  png(paste("HadCRUT3 T-anom map ",month_to_date(m),".png",sep=""),width=800,height=500)
  plotmonthmap(m)
  dev.off()
}

# Calculate average distribution 1900-1950
# Late enough to have decent data, early enough to be before global warming
baseline = gettadist(c(601:1200))   # months 601-1200 = Jan 1900 to Dec 1949

# Plot the distribution to see that it looks sensible
png("HadCRUT3 T-anom 1900-1950 dist.png",width=800,height=500)
plot(c(100:300)*0.1-20.1,baseline[100:300],type="l",
     xlab="Temperature Anomaly C",ylab="Frequency /0.1 C",
     main="Temperature Anomaly Distribution 1900-1950")
dev.off()

# --------------------

# A few functions for plotting

# Add a semi-transparent grid to a plot
gr = function() {abline(h=c(-20:20),v=seq(1850,2010,10),col=rgb(0.6,0.6,1,0.5))}

# Plot some data
plotdist = function(data,trange=20,isyears=FALSE,dogrid=FALSE,main) {
  if(isyears) { drange = c(1:dim(data)[1]) + 1850 } else { drange = c(1:dim(data)[1])/12 + 1850 }
  trange = c((200-trange*10):(200+trange*10))
  image(drange,trange*0.1-20.1,
        data[,trange],
        col=gray(rev(0:20)/20),useRaster=TRUE,
        xlab="Year",ylab="Temperature Anomaly C",main=main)
  if(dogrid) { gr() }
}

# Colour scheme and breaks for change plot
# Scale is logarithmic, so goes from 1/7.4 to 1/2.7 to 1 to 2.7 to 7.4
redblue=c(rgb(0,1,1),rainbow(10,start=3/6,end=4/6),rainbow(10,start=0,end=1/6),rgb(1,1,0))
e2breaks=c(-100,-10:10/5,100)

# This generates a scale for the change plots
png("E2 Scale.png",width=200,height=616)
image(y=exp(c(-10:10/4)),z=matrix(c(-10:10/4),nrow=1,ncol=21),
      col=redblue,breaks=e2breaks,log="y",xaxt="n",ylab="Probability Ratio")
dev.off()

plotdistchange = function(data,baseline,trange=20,isyears=FALSE,dogrid=FALSE,main) {
  if(isyears) { drange = c(1:dim(data)[1]) + 1850 } else { drange = c(1:dim(data)[1])/12 + 1850 }
  trange = c((200-trange*10):(200+trange*10))
  bdata=data/matrix(rep(baseline,each=dim(data)[1]),nrow=dim(data)[1])
  image(drange,trange*0.1-20.1,
        log(bdata[,trange]),
        col=redblue,breaks=e2breaks,useRaster=TRUE,
        xlab="Year",ylab="Temperature Anomaly C",main=main)
  if(dogrid) { gr() }
}

# --------------------

# Make an array of every month's distribution
datalst = aperm(vapply(c(1:1949),gettadist,FUN.VALUE=c(1:402)*1.0))

# Plot it out for a quick look
for(w in c(3,5,10,20)) {
  png(paste("HadCRUT3 T-anom dist ",w,".png",sep=""),width=2000,height=600)
  plotdist(datalst,w,dogrid=TRUE,main="Temperature Anomaly Distribution")
  dev.off()
}

# and a small one
png("HadCRUT3 T-anom dist 5 small.png",width=600,height=500)
plotdist(datalst,5,dogrid=TRUE,main="Temperature Anomaly Distribution")
dev.off()

# Log-probability is shown to emphasise outliers
png("HadCRUT3 T-anom log-dist 20.png",width=2000,height=600)
plotdist(log(datalst),20,dogrid=TRUE,main="Temperature Anomaly Distribution (log)")
dev.off()

# and a small one
png("HadCRUT3 T-anom log-dist 20 small.png",width=600,height=500)
plotdist(log(datalst),20,dogrid=TRUE,main="Temperature Anomaly Distribution (log)")
dev.off()

# Now plot the change
png("HadCRUT3 T-anom dist log-change 20.png",width=2000,height=600)
plotdistchange(datalst,baseline,20,dogrid=TRUE,main="Change in Temperature Anomaly Distribution")
dev.off()

# Plot the middle +/-10 C, red means more common than 1900-1950 average, blue less common
png("HadCRUT3 T-anom dist log-change 10.png",width=2000,height=600)
plotdistchange(datalst,baseline,10,dogrid=TRUE,main="Change in Temperature Anomaly Distribution")
dev.off()

# and a small one
png("HadCRUT3 T-anom dist log-change 10 small.png",width=600,height=500)
plotdistchange(datalst,baseline,10,dogrid=TRUE,main="Change in Temperature Anomaly Distribution")
dev.off()

# --------------------------------
# Analysis by month

# Reserve space
datam = rep(0,162*402*12);dim(datam)=c(12,162,402)
basem = rep(0,402*12);dim(basem)=c(12,402)

# Generate an array of results for each month, and plot distributions and changes
for(m in 1:12) {
  datam[m,,] = aperm(vapply(seq(m,m+161*12,12),gettadist,FUN.VALUE=c(1:402)*1.0))
  basem[m,] = gettadist(seq(m+600,m+1200,12))

  png(paste("HadCRUT3 T-anom dist 5 ",month[m],".png",sep=""),width=800,height=600)
  plotdist(datam[m,,],5,isyears=TRUE,dogrid=TRUE,
           main=paste(month[m],"Temperature Anomaly Distribution"))
  dev.off()

  png(paste("HadCRUT3 T-anom log-dist 20 ",month[m],".png",sep=""),width=800,height=600)
  plotdist(log(datam[m,,]),20,isyears=TRUE,dogrid=TRUE,
           main=paste(month[m],"Temperature Anomaly Distribution (log)"))
  dev.off()

  for(w in c(10,15,20)) {
    png(paste("HadCRUT3 T-anom dist log-change ",w," ",month[m],".png",sep=""),width=800,height=600)
    plotdistchange(datam[m,,],basem[m,],w,isyears=TRUE,dogrid=TRUE,
                   main=paste(month[m],"Change in Temperature Anomaly Distribution"))
    dev.off()
  }
}

# Done!

# --------------------------------

# Interpretation
# --------------
# Blue surrounded by red means a broader distribution, as in the 1850-1870 period
# (small gridcell samples give larger variances).
# Red surrounded by blue means a narrower distribution, as in the 1950-1990 period.
# Blue over red means cooler climate, as in the 1900-1920 period.
# Red over blue means a warmer climate, as in the post-1998 period.

# Observations
# ------------
# Period 1900-1950 (baseline) shows a simple shift.
# Period 1950-1990 appears to be a narrowing and shift.
# Period post 1990 shows effects of shift meeting effects of narrowing.
# Reduction in cold weather since 1930 unambiguous, but increase in warm weather only clear post 1998;
# until then, there was a decrease in warmer weather due to narrowing of distribution.
# Step change in 1998, little change thereafter.
# Post 2000 change is a bit bigger and longer lasting but not all that dissimilar to 1940s.

# Monthly
# -------
# Picture looks very different when split out by month!
# Spread in summer is far narrower than in winter.
# Offset in summer in 21st C exceeds upper edge of distribution from 1900-1950.
# Excess over 1940s looks about 0.5 C.
# Looking at maps, a lot of it appears to be in Siberia and Northern Canada.
# There is still a narrowing of the distribution 1950-1990.
# Step change occurs a little earlier, around 1990, to 1940s levels
# then jumps again in 1998 to higher level. This is within 20th C annual bounds
# but outside narrower summer bounds.

# Caveats
# -------
# The data is averaged over 5x5 degree gridcells monthly. Spread of
# daily data would be 2-5 times broader (?).
# Spread at point locations would be broader still.
# Note also 'great dying of the thermometers' around 1990/2003 - could affect variance.
# Has been reported prior to Soviet collapse cold weather exaggerated, to get fuel subsidy (?).
# Distribution extremes poorly sampled, results unreliable beyond +/-5 C.
# (May be able to do better with more smoothing?)
# Temp anomalies outside +/-10 C occur fairly often, 'heatwaves' probably in this region.
# No area weighting - gridcells nearer poles represent smaller area.
# (This is intended to look at point distributions, not global average, though.
# Area weighting arguably not appropriate.)
# No error estimates have been calculated.
# All the usual stuff about UHI, adjustments, homogenisation, LIA/MWP, etc.

# ##################################################

39 Comments
JJ
August 14, 2012 12:28 am

Nick Stokes says:
All the base period does is express a reference value that you subtract from the grid or station value to get an anomaly.

That is not correct. Your understanding of base period and anomaly WRT trend analysis is not applicable here. This is not a trend analysis. It is a dolled-up analysis of variance. And Dolly has been sorely mistreated.
I understand that it is difficult to comprehend what they did. In a real scientific paper, that would be explained in the Methods section in sufficient detail that you could theoretically replicate their analysis. Unfortunately, this is not a real scientific paper. It is a political essay gaudily festooned with maps and graphs. Thus, the Methods section contains no description of the methods. It should really be called the Rationalization section, because all it does is attempt to fig-leaf the obviously contrived selection of the base period. So, looking at the Methods section doesn’t help you like it should.
Still, you are a bright guy. Between the way they describe their results (which comprise perhaps 30% of the Results section, the balance being essays on “loaded dice” metaphors and other such crap that should properly have been put in the Political Discussion section, which in turn should not appear in a scientific paper) and my criticism thereof (you did read what you are putatively responding to, didn’t you?), you should be able to discern the argument that they are making WRT comparisons of data against the variability of the base period and the “extreme results” that they gin up by illegitimately restricting said base period to a time frame that they admit they chose because it does not vary much with respect to the rest of the dataset.
You may have trouble believing that they did something as trivial and yet as ballsy as they did, but keep in mind the level of arrogance and “mission from God” attitude of entitlement with which you are confronted.

Nick Stokes
August 14, 2012 4:35 am

JJ,
“you did read what you are putatively responding to, didn’t you?”
A discerning question, but the key word is putatively. I was not responding to the paper; I was querying what effect was being attributed in this post to the base period. In fact, when (as prodded) I read the PNAS paper more thoroughly, I see better what you mean.
But I see more. Hansen did compare the effect of different base periods. In Fig 2 he shows the sd patterns using 1951-80 (something of a low point with variation) and 1981-2010 (a high point). And he gives the numerical averages, and they aren’t that different. 1981-2010 was a period of significant trend, and he shows a detrended result, which is really the one that counts. And yes, he probably should have shown 1951-90 detrended; but with the lower trend, it’s probably little different.

JJ
August 14, 2012 7:28 am

Nick Stokes says:
But I see more. Hansen did compare the effect of different base periods.

Don’t watch his hands. Watch the pea.
Now, read the second paragraph of my post.
And feel free to finish it, if you like :).

Pamela Gray
August 14, 2012 7:33 am

This is way cool!!!!! Nature’s temperature walk, even when acted upon by humans, rarely, exceedingly rarely, causes an almost mechanical signal to show up in raw data. Your color coded graph has uncovered a non-random signal in the later period of your data set, that would lead me to immediately suspect an artifact. An artifact in this context is something not caused by nature and certainly not by humans as this source signal raw data set has yet to rise above the noise of nature’s raw data. However, this data is not exactly raw. It has been homogenized. It is as if the anomaly has bumped up against a baseline that it is not allowed to transgress. Do I see the telltale fingerprint of bias here?
Can this graph be overlaid with homogenization events, station drop-out, a point in time when someone changed the algorithm moving forward? A splicing event? It just screams artifact to me.

wayne
August 14, 2012 8:52 am

Pamela, I agree. See my comment above. I’ve never seen nature just change one day and never return, and you can see this in the colored plots. Seems to me to be a change in measurement methods of some sort or possibly in the post-measurement adjustment algorithms.
This ‘R’ code produces a wealth of views (nearly 120 .png files) and if anyone is having a problem downloading ‘R’ and running this just ask in a comment, I’ll help, there are a couple of changes necessary if your OS is Windows.
Nullius, thanks for that code.

Nullius in Verba
August 14, 2012 2:31 pm

“Was this a shift in the climate by nature or was this a change in either the instrumentation or the methods applied in measuring or adjusting?”
Good question. I don’t know the answer.
If you generate and look at the individual maps for the early years, you will see that global coverage was pretty poor up until the 1930s. One interesting experiment might be to pick just those gridcells for which continuous records exist, and have a look at the trend constructed just from those.
“No, I can’t. What difference do you think the choice of anomaly base period makes?”
Hansen is exploring not just the change in the mean but also the change in the spread. The contrast is greater if you pick the baseline in the period with the narrowest spread.
It’s clear even with the 1900-1950 baseline that the variance has still increased post-98, of course. But it could easily be assumed that the variance during the baseline was the normal state of affairs up until AGW.
“I believe that to investigate the change in distribution we should compensate for changes in the mean first”
It depends what aspect you are interested in. If you’re just interested in the spread, the point variability of the weather, then yes, you have to detrend and centre. But the trend and centre are themselves properties of the distribution, so you also have to consider the effect without such adjustments.
When faced with such decisions, I always prefer to solve the dilemma by doing both, and thus getting an understanding of the effects the decision has. More information gives a more complete understanding of what is really going on.
“Distribution in these appears much sharper and having greater variability, probably partially due to finer grid.”
I should probably mention that the distributions are smoothed slightly, so as to be able to extend the plot further into the outliers. Also, you have to plot it at the full resolution to be able to see it – but the image is 2000 pixels across and would make a mess in a blog post, so the one you see here is a compressed version.
“IMO what is needed is measurement of the wet bulb temperature as a more accurate measure”
Frankly, I’m highly sceptical about the quality of the data anyway, for all sorts of reasons. But I didn’t want to get into that. The idea was just to look at the data we had, to get a handle on what Hansen was talking about. Only when you understand it can you criticise it.
“I would have thought that before discussing anomalies we should discuss temperature accuracy.”
Yep. But see my previous comment.
“Anthony: I believe the “download” call should be “download.file” instead.”
Quite right! Sorry about that!
(Mods: is it possible to change the
download(“http://www.metoffice…
to
download.file(“http://www.metoffice…
please?)
“Your color coded graph has uncovered a non-random signal in the later period of your data set, that would lead me to immediately suspect an artifact.”
Artefacts are a possibility, but I would also suspect quasi-stability as a possibility: the system jumps from one locally stable state to another. The “great climate shift” in the 1970s when the PDO changed character is one example.
I have long thought that the rise in mean temperature over the late 20th century looked less like a rising linear trend than it did a series of steps. In 1998 it stepped up, overshot and oscillated back down again before settling at the new level. (Gibbs phenomenon.)
If so, it would seem less likely to be due to a gradual rise in greenhouse forcing, although that isn’t ruled out. But frankly, I don’t know. It’s just speculation.
“Can this graph be overlayed with homogenization events, station drop-out, a point in time when someone changed the algorithm moving forward?”
If you can find the data, feel free. I saw some nice pictures of ‘the great dying of the thermometers’ on Jo Nova’s site a while back.
My thanks to everyone who said they liked it. As a bonus, if you type in
plot(log(baseline))
the resulting curve should be a parabola (like a graph of -x^2) if the distribution is Gaussian, as Hansen supposes in his paper. I’ll let you judge for yourself if that thing looks like a parabola.
Taking the standard deviation of a heavy-tailed distribution is well known not to be robust: it is strongly affected by outliers, and statisticians normally use more robust measures of spread for such distributions. I’m not sure whether that has any significant effect in this case, though.
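For instance, a two-line illustration with made-up numbers (nothing to do with the temperature data):

```r
# sd() vs the robust mad(): a few gross outliers inflate the standard
# deviation several-fold but barely move the median absolute deviation.
set.seed(1)
x <- rnorm(1001)            # well-behaved sample, sd and mad both near 1
x_out <- c(x, rep(50, 5))   # same sample plus five gross outliers
sd(x); sd(x_out)            # sd is inflated several-fold
mad(x); mad(x_out)          # mad barely changes
```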

August 14, 2012 9:20 pm

Just in case there are any other closet climate newbies who couldn’t quite grasp the discussion points because of a lack of definition, this can clarify:
Climatologies and Standardized Anomalies
http://iridl.ldeo.columbia.edu/dochelp/StatTutorial/Climatologies/

George E. Smith;
August 14, 2012 11:29 pm

“””””…..day by day says:
August 14, 2012 at 9:20 pm …..”””””
Well your authority asserts that “climatologies” are the 20-30 year “long term” averages; presumably of weather.
I would say that climate is the definite integral of weather; the summation of everything that happened before.
And don’t go giving me that well they differ by a constant factor. That is true only of a perfectly linear system. Wake me up the next time you discover ANY perfectly linear system. Certainly no such thing on the weather front.

AJ
August 15, 2012 8:10 pm

I downloaded the NCDC CONUS (USA48) climate division temperature data and calculated the standard deviation for each 11 year period from 1895-1905 to 2001-2011. The periods centered between 1960 to 1980 show unusually low values. The trend is steeply upward for the periods centered from 1980 to 2006, but they are not unusual compared to the pre-1960 values. That is, for the continental U.S. at least, recent variability is not unusual compared to the historical record. Perhaps the northern hemisphere as a whole experienced unusually persistent zonal wind patterns during the baseline period.
A few notes. The NCDC data has 344 divisions. For each divisional 11 year period, the temperatures were detrended and the residuals retained. I did this to exclude the trend in the “DC” signal, whereas I was only interested in the “AC” signal. For each period, the combined residuals from all divisions were used to calculate the stdev.
Here’s the source. Doubt if my tags will prevent wordpressderdization of the code. Sorry, not really polished or commented adequately at this stage. It takes a few minutes to run.
[sourcecode language="R" wraplines="false" collapse="false"]
# plot stdev of USA48 anomalies for 11yr moving periods 1895:1905 - 2001:2011
#
# download ftp://ftp.ncdc.noaa.gov/pub/data/cirs/drd964x.tmp.txt and place in working directory
#
# read file into dataframe
wdths=c(4,2,4,7,7,7,7,7,7,7,7,7,7,7,7)
cnames=c("div","attr","yr","m01","m02","m03","m04","m05","m06","m07","m08","m09","m10","m11","m12")
colclss=c("character","character","integer","numeric","numeric","numeric","numeric","numeric","numeric",
"numeric","numeric","numeric","numeric","numeric","numeric")
tmpdf=read.fwf("drd964x.tmp.txt",widths=wdths,col.names=cnames,colClasses=colclss)
nyrs = 11 # 11 year sample periods
maxyr = max(tmpdf$yr)
minyr = min(tmpdf$yr)
endyr = maxyr - nyrs # sample will exclude maxyr (could contain -99.99's)
divs = unique(tmpdf$div)
denslst = NULL
sdvec = NULL
densx = NULL
densy = NULL
t = c(0,1/12,2/12)
t = c(t,t+1,t+2,t+3,t+4,t+5,t+6,t+7,t+8,t+9,t+10)
for (i in minyr:endyr){
endsamp = i + nyrs - 1
sampyrs = i:endsamp
yrtmpdf = subset(tmpdf,yr %in% sampyrs)
anomvec = NULL
for (sampdiv in divs){
divtmpdf = subset(yrtmpdf,div==sampdiv)
divtmp = rbind(divtmpdf$m06,divtmpdf$m07,divtmpdf$m08)
dim(divtmp) = length(divtmp)
divlm = lm(divtmp ~ t)
divanom = divtmp - predict(divlm)
anomvec = c(anomvec,divanom)
}
s = sd(anomvec)
sdvec = c(sdvec,s)
d = density(anomvec)
denslst = c(denslst,list(d))
if (i %% 10 == 0){
densx = rbind(densx,d$x)
densy = rbind(densy,d$y)
}
}
centered_dates = (minyr:endyr)+5
plot(centered_dates,sdvec,type='l')
[/sourcecode]

JJ
August 15, 2012 9:23 pm

Nullius in Verba says:
Hansen is exploring not just the change in the mean but also the change in the spread.

Both of which are determined by the choice of the ‘base period’.
If Hansen had shifted or expanded his selection of the base period to include the warm years prior to 1950, not only would the variation have increased, but the mean would have been nudged up as well. Instead, he rigged the outcome by picking a period with lower variance and a lower mean. This increases the magnitude of the current anomalies, lowers the bar of what constitutes an “extreme event”, and thus exaggerates his “results” to gin up the scary story. (you still there Nick?)
Roy Spencer has posted a write-up on that part of the scam, running the numbers against alternate base periods here:
http://www.drroyspencer.com/2012/08/fun-with-summer-statistics-part-2-the-northern-hemisphere-land/
Many thanks to guys like yourselves for taking the time to examine the details!

AJ
August 16, 2012 6:18 am

A couple of other notes about my code above. One, I should have just plunked the URL into the read.fwf function to make it turnkey ready. Two, I debated whether to calculate the monthly anomalies before detrending. The code above does not do this. If I do include this step then the general pattern is mostly the same and my observations still stand. The largest difference is that the stdev is reduced by ~1C.

wayne
August 17, 2012 9:16 pm

Nullius in Verba:
Here’s one curiosity to pair up with your third plot in this post. I could swear I had seen some graph in the past that looked basically like yours, and after digging through some 10,000 climate ‘science’ files and data collected over the years I found it, and it was one of my graphs! This was a special plot I created and sent to Anthony way back when doing some (possibly meaningless) analysis on periods of the sunspot counts and transitions since the 1700’s using a novel smoothing, or more properly flattening, algorithm, and it has a curious similarity to your plot:
http://i48.tinypic.com/289vef.png
They both take an unmistakable upward step in the mid-thirties. Now that is curious. Could this be just a coincidence? Hmm…

Wayne2
August 22, 2012 6:18 am

(I’m not “wayne”. Strange how the Waynes would get the last posts, eh?)
I did a quick calculation of July’s largest-sigma temperatures in each of the eras 1920-1949, 1950-1979, and 1980-2009 and made a graph of the US. I used only GHCNv3 stations that had at least 27 (of 30) July observations for each era, and used the 1950-1979 era’s SD as the baseline for all three eras. I detrended each era with a linear fit before calculating means or SD’s. The color indicates the era in which that station’s maximum positive deviation occurred, and the size indicates how many 1950-1979 sigmas it was. For reference, the tiny and huge circles below California are 0.1 and 6 sigmas, respectively.
http://tinypic.com/r/148qrp/6

Ed B
August 24, 2012 1:19 pm

Very late to the party here, but if the curve was shifted as Hansen suggests, in addition to a vast increase in the number of high temperature events, wouldn’t there be a corresponding massive decrease in the number of low temperature events? Has anyone looked at that data?