Guest Post by Willis Eschenbach
Anthony Watts, Lucia Liljegren, and Michael Tobis have all done a good job blogging about Jeff Masters’ egregious math error: he claimed that a run of high US temperatures had only a 1 in 1.6 million chance of being a natural occurrence. Here’s his claim:
U.S. heat over the past 13 months: a one in 1.6 million event
Each of the 13 months from June 2011 through June 2012 ranked among the warmest third of their historical distribution for the first time in the 1895 – present record. According to NCDC, the odds of this occurring randomly during any particular month are 1 in 1,594,323. Thus, we should only see one more 13-month period so warm between now and 124,652 AD–assuming the climate is staying the same as it did during the past 118 years. These are ridiculously long odds, and it is highly unlikely that the extremity of the heat during the past 13 months could have occurred without a warming climate.
All of the other commenters pointed out reasons why he was wrong … but they didn’t get to what is right.
Let me propose a different way of analyzing the situation … the old-fashioned way, by actually looking at the observations themselves. There are a couple of oddities to be found there. To analyze this, I calculated, for each year of the record, how many of the months from June to June inclusive were in the top third of the historical record. Figure 1 shows the histogram of that data, that is to say, it shows how many June-to-June periods had one month in the top third, two months in the top third, and so on.
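For concreteness, here is a minimal sketch of that calculation in Python. Willis worked in a spreadsheet, so this is a reconstruction rather than his actual code; the file name conus_monthly.csv and its year-by-month layout are assumptions, and each month is compared against the tercile of its own calendar month (each June against other Junes, and so on), as Willis clarifies in the comments below:

```python
import numpy as np

# Hypothetical input: an (n_years x 12) array of monthly mean CONUS
# temperatures, rows = years 1895 onward, columns = Jan..Dec.
temps = np.loadtxt("conus_monthly.csv", delimiter=",")  # assumed layout

# For each calendar month, flag the years in its warmest third.
cutoffs = np.quantile(temps, 2.0 / 3.0, axis=0)  # per-month tercile cutoff
top_third = temps >= cutoffs                     # boolean, same shape

# Count, for each June-to-June span, how many of its 13 months fall in
# the top third: June..Dec of year y plus Jan..June of year y + 1.
counts = [
    top_third[y, 5:].sum() + top_third[y + 1, :6].sum()
    for y in range(len(temps) - 1)
]

# Histogram of the counts (0..13 months in the warmest third).
hist = np.bincount(counts, minlength=14)
print(hist)
```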
Figure 1. Histogram of the number of June-to-June months with temperatures in the top third (tercile) of the historical record, for each of the past 116 years. Red line shows the expected number if they have a Poisson distribution with lambda = 5.206, and N (number of 13-month intervals) = 116. The value of lambda has been fit to give the best results.
The first thing I noticed when I plotted the histogram is that it looked like a Poisson distribution. This is a very common distribution for data that represent discrete occurrences, as in this case. Poisson distributions cover things like how many people you’ll find in line at a bank at any given instant, for example. So I overlaid the data with a Poisson distribution, and I got a good match.
Now, looking at that histogram, finding one period in which all thirteen months were in the warmest third doesn’t seem so unusual. In fact, given the number of years we are investigating, the Poisson distribution gives an expected value of 0.2 occurrences. We find one occurrence where all thirteen were in the warmest third, so that’s not unusual at all.
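A quick check of that 0.2 figure, using the lambda and N from Figure 1 (my arithmetic, not from the post):

```python
import math

lam, N = 5.206, 116
p13 = math.exp(-lam) * lam**13 / math.factorial(13)  # Poisson P(X = 13)
print(N * p13)  # ~0.21 expected 13-for-13 periods in 116 June-to-June spans
```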
Once I did that analysis, though, I thought “Wait a minute. Why June to June? Why not August to August, or April to April?” I realized I wasn’t looking at the full universe from which we were selecting the 13-month periods. I needed to look at all of the 13-month periods, from January-to-January to December-to-December.
So I took a second look, and this time I looked at all of the possible contiguous 13-month periods in the historical data. Figure 2 shows a histogram of all of the results, along with the corresponding Poisson distribution.
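The sliding-window version is a small change to the earlier sketch; again a reconstruction, reusing the assumed top_third array from above:

```python
import numpy as np

# Flatten the per-calendar-month top-third flags into one long
# chronological series (Jan 1895, Feb 1895, ...), then count the
# top-third months in every contiguous 13-month window.
flags = top_third.ravel()
window_counts = [flags[i:i + 13].sum() for i in range(len(flags) - 12)]
print(len(window_counts))  # on the order of 1374 windows for this record
hist_all = np.bincount(window_counts, minlength=14)
```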
Figure 2. Histogram of the number of months with temperatures in the top third (tercile) of the historical record for all possible contiguous 13-month periods. Red line shows the expected number if they have a Poisson distribution with lambda = 5.213, and N (number of 13-month intervals) = 1374. Once again, the value of lambda has been fit to give the best results.
Note that the total number of periods is much larger (1374 instead of 116) because we are looking, not just at June-to-June, but at all possible 13-month periods. Note also that the fit to the theoretical Poisson distribution is better, with Figure 2 showing only about 2/3 of the RMS error of the first dataset.
The most interesting thing to me is that in both cases, I used an iterative fit (Excel solver) to calculate the value for lambda. And despite there being 12 times as much data in the second analysis, the values of the two lambdas agreed to two decimal places. I see this as strong confirmation that indeed we are looking at a Poisson distribution.
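The Excel-solver step translates to a one-parameter least-squares fit. Here is a sketch of how that might look with scipy, reusing hist_all from the sketch above; this is my reconstruction of the procedure, not the original spreadsheet:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

def rms_error(lam, hist, n_periods):
    """RMS difference between the observed histogram counts and the
    expected Poisson counts n_periods * P(X = k), for k = 0..13."""
    k = np.arange(len(hist))
    expected = n_periods * poisson.pmf(k, lam)
    return np.sqrt(np.mean((hist - expected) ** 2))

fit = minimize_scalar(rms_error, bounds=(1.0, 10.0), method="bounded",
                      args=(hist_all, hist_all.sum()))
print(fit.x)  # fitted lambda, ~5.21 on Willis's figures
```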
Finally, the sting in the end of the tale. With 1374 contiguous 13-month periods and a Poisson distribution, the number of periods with 13 winners that we would expect to find is 2.6. So far from Jeff Masters’ claim that finding 13 in the top third is a one-in-1.6-million chance, my results show that finding only one case with all thirteen in the top third is actually below the number we would expect, given the size and the nature of the dataset …
w.
Data source: NOAA US Temperatures. Thanks to Lucia for the link.
The post on Lucia’s blog which I found most accessible in demolishing Masters’ statistical simple-mindedness (or ignorance) is by Climatebeagle. It demonstrates the mental and math error underlying the 1-in-1.6-million assertion so ably that a complete statistical stupe could grasp it:
climatebeagle (Comment #99257)
July 10th, 2012 at 2:54 pm
Using the same logic as Jeff Masters and looking at the US data:
5 consecutive months in top-third should occur every 20 years but have occurred 18 times in 116 years.
6 consecutive months in top-third should occur every 60 years but have occurred 11 times in 116 years.
7 consecutive months in top-third should occur every 182 years but have occurred 4 times in 116 years.
8 consecutive months in top-third should occur every 546 years but have occurred 3 times in 116 years.
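(Under the same independence assumption, those waiting times fall out directly: a run of k top-third months has probability (1/3)^k, and with 12 chances per year the expected recurrence is 3^k / 12 years. A quick check of climatebeagle’s figures, my arithmetic:)

```python
for k in range(5, 9):
    years = 3**k / 12           # expected years between runs of length k
    print(k, round(years, 1))   # 5: 20.2, 6: 60.8, 7: 182.2, 8: 546.8
```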
@Nigel Harris, you said “What Jeff Masters actually said was: *Each* of the 13 months from June 2011 through June 2012 ranked among the warmest third of *their* historical distribution for the first time in the 1895 – present record.”
No, Jeff Masters was looking at a thirteen month period, not individual months compared to the same months over a period. To quote from the original – “Thus, we should only see one more 13-month period so warm between now and 124,652 AD”.
As a reasonable person, I take this to mean a period of 13 consecutive months.
Willis Eschenbach says:
July 11, 2012 at 12:51 am
“The controversy is that the “ridiculously long odds” he refers to are wildly incorrect …”
Yes, well, that much is obvious. But I don’t think yours is necessarily far better, as it is still treating the data as if it were stationary random data. Lots of distributions look a lot like the Poisson distribution. My point is, this isn’t a fight worth fighting. All he is saying is that temperatures have risen. They have, though they are no longer rising. It says nothing about attribution.
JJ says:
July 11, 2012 at 7:39 am
“They both compare the probability of the current event against the assumption of zero change in climate, none whatsoever, over the last 118 years. “
Exactly.
AGW heatwave downed by a Poisson pen.
I work in the private sector, and actually get paid to do things like apply Poisson and Negative Binomial distributions correctly.
I would hire pjie2 to do this work with me. Willis, not so much.
In this case, lambda is known by definition, and the data are not independent.
Willis’s analysis demonstrates nothing other than this. In his own words: “picking the appropriate model for the situation is the central, crucial, indispensable, and often overlooked first step of any statistical analysis”. As Willis demonstrates here, by picking the wrong model.
As much as I despair about errors such as these, I despair more about those people who eat it up, uncritically.
John West said: “It seems to me that the climate change action advocates are not trying to understand the data but are trying to understand how to use the data to promote the cause.”
Pot, meet the kettle.
I mean, even if we take Willis’ argument at face value, the odds are still 2.6 in 1374, or about 0.2%. Those are still long odds. Does that in any way lend credence to the argument that observed warming of the globe in the latter third of the 20th century can be blamed on humans? Not in the slightest.
Willis,
Any comment on Lucia’s updated estimate, and the estimate calculated by Tamino?
http://tamino.wordpress.com/2012/07/11/thirteen/#more-5309
BCC says:
July 11, 2012 at 9:50 am
“I work in the private sector, and actually get paid to do things like apply Poisson and Negative Binomial distributions correctly. I would hire pjie2 to do this work with me. Willis, not so much.”
This is why I enjoy this site so much. Eventually someone shows up who understands the issue. It doesn’t ever seem to matter what the issue is either, as they range from logging, to fire suppression, to nuclear reactors, to tsunamis, and here, to statistical analysis.
As for all those kudos extended to Willis for his analysis, I’ve always found that it’s better to wait until the Nigels, pjie2s, and BCCs show up before jumping into unknown waters too quickly. Regardless, this site is, I believe, unequaled for its ability to draw out explanations from those intimately involved in whatever issue is being discussed. The comments are nearly always as valuable a read as the original article, and often more so.
(To BigBadBear at 8:55 am: The 1/3 to the 13th power is correct assuming randomness, which was my assumption. I was just explaining the derivation of the NCDC calculation. Their assumption of randomness was incorrect, however.)
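The derivation is one line: thirteen independent months, each with a 1-in-3 chance of landing in the top third, gives (1/3)^13. Checking that the denominator really is NCDC’s figure:

```python
print(3**13)  # 1594323, i.e. odds of 1 in 1,594,323 under independence
```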
Willis,
You have a good start on the analysis, but you need to take an extra step: you need to de-trend the data. Find the best linear fit to your data set (temperature vs. time), then subtract that trend from each month’s temperatures. This is the data that you need to analyze for the expected frequency of top thirds. The odds you calculated, 2.6/1374, are the odds given whatever linear trend exists in the data; your number may or may not be influenced by the trend.
A much more useful exercise would be to split the data into two periods – the first half and the second half. For each half, figure out how many months in each contiguous year are in the top third of the whole data set. Figure out if the odds of seeing a contiguous year in the top third has changed from the first ~58 years to the second ~58 years. So, for example, if the first half yields 2.6 and the second half gives the same number, then the probability of setting this record hasn’t changed.
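A minimal sketch of the de-trending step suggested above, continuing the hypothetical temps array from the earlier sketches (illustrative only, not a worked analysis):

```python
import numpy as np

# Remove the best-fit linear trend from the flattened monthly series,
# then recompute the per-calendar-month terciles on the residuals.
months = np.arange(temps.size)
series = temps.ravel()
slope, intercept = np.polyfit(months, series, 1)
residuals = (series - (slope * months + intercept)).reshape(temps.shape)

cutoffs_dt = np.quantile(residuals, 2.0 / 3.0, axis=0)
top_third_dt = residuals >= cutoffs_dt
# ...then repeat the 13-month window counts on top_third_dt and compare.
```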
For those worrying why lambda = 5.2+ and not *exactly* 4.333, I want to remind you that this is not continuous data. This study involves discrete data, in which monthly readings fall into ‘buckets’ that do not permit fractions. For example, given the set of four numbers [1,2,2,3], one finds that there are three data points in the top *half* of the distribution, because the #2 bucket cannot be subdivided. For the same reason a ‘tie’ to an old temperature record counts as a new temperature record.
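That bucket point in code form, using the commenter’s own [1,2,2,3] example:

```python
data = [1, 2, 2, 3]
cutoff = sorted(data)[len(data) // 2]   # boundary bucket is the value 2
in_top_half = sum(x >= cutoff for x in data)
print(in_top_half)  # 3 of 4 values, because ties share a bucket
```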
@Bart: Correct. Masters’ point is this:
These are ridiculously long odds, and it is highly unlikely that the extremity of the heat during the past 13 months could have occurred without a warming climate.
Where “a warming climate” means a climate that has warmed recently (at least, that’s what it means in terms of the math we’re using to analyze it).
We don’t see many people still claiming that the US isn’t warming (or, hasn’t warmed), but if you do: well, here’s another piece of evidence which would indicate that they’re wrong.
I haven’t had time to read all the comments and I really need to re-read and digest the article, but I was under the impression that a Poisson distribution applies only when the event in question occurs at a known average rate and is independent of the time since the previous event (the degree of randomness?). From a brief read-through, I’m not sure this applies in a valid way to Willis’ argument.
I think it was really the NCDC that did the original 1,594,323 calculation (and it is for 13 months, not 12, despite their “12-month” heading). They wrote:
“Warmest 12-month consecutive periods for the CONUS
These are the warmest 12-month periods on record for the contiguous United States. During the June 2011-June 2012 period, each of the 13 months ranked among the warmest third of their historical distribution for the first time in the 1895-present record. The odds of this occurring randomly is 1 in 1,594,323. The July 2011-June 2012 12-month period surpassed the June 2011-May 2012 period as the warmest consecutive 12-months that the contiguous U.S. has experienced.”
Here: click on “Warmest 12 month consecutive periods for CONUS”
http://www.ncdc.noaa.gov/sotc/national/2012/6/supplemental
In the raw records, of course, 1934 would tie the current 13-month average.
Bart says:
I mean, even if we take Willis’ argument at face value, the odds are still 2.6 in 1374, or about 0.2%. Those are still long odds.
In order to take Willis’ argument at face value, one has to understand what Willis’ argument is. I don’t think most here understand it, and that includes Willis.
The gist of the argument presented is that if you find a Poisson distribution that approximates some observations, then those observations are likely to fall near that Poisson distribution. That isn’t a particularly interesting argument, being a tautology. It says nothing about the climate, nor about the claims made by Jeff Masters. The odds claimed by Masters and the odds claimed by Willis are irrelevant to each other. Comparing them is meaningless, and neither of them is individually of any interest to the question at hand.
JJ
mb says:
July 11, 2012 at 3:21 am
Thanks, mb. The model is not predicting how many months will come up out of 13. It is predicting how many months will come up. You are correct that there will be an “edge effect”, since we are only looking at 13-month intervals. But since it only affects ~ one case in 1400, the effect will be trivially small.
Suppose that in fact there is one run of 14 in the data. Since we are counting in 13-month intervals, in the first case (June to June only) it will be counted as a run of 13. And in the second case (all 13-month intervals) it will be counted as two runs of 13 … but in neither case does that materially affect the results shown above.
So in practice, the edge effect slightly increases the odds of finding a run of 13.
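To make the edge effect concrete: a run of 14 consecutive top-third months contains exactly two complete 13-month windows, so the sliding-window count picks it up twice. A toy check:

```python
run = [True] * 14  # a hypothetical 14-month hot streak
full_windows = sum(all(run[i:i + 13]) for i in range(len(run) - 12))
print(full_windows)  # 2
```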
w.
Nigel Harris says:
July 11, 2012 at 3:42 am
See my response above. Your objection is real but makes no practical difference.
w.
cd_uk says:
July 11, 2012 at 5:41 am
OK, great, you’re right. And Mathworld is wrong …
Keep believing that, cd_uk, hold tight to that, it seems important to you. Meanwhile, in the real world, such trivial differences as you point out are roundly ignored.
w.
Is this not like the game of the extremely high probability of finding two people born on the same day of the month (not the same month; any two people born on the same date, e.g., the 13th) in a room of more than 15 people?
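It is similar, and the analogy holds up numerically. Treating days of the month as uniform over 31 values (a simplification), the chance of at least one shared day among 15 people is already close to certain:

```python
p_no_match = 1.0
for k in range(15):              # add people one at a time
    p_no_match *= (31 - k) / 31  # each must miss all days taken so far
print(1 - p_no_match)            # ~0.98 chance of a shared day of month
```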
Maybe he punched in seconds; close enough for government work.
I note that cd_uk’s use of punctuation is as sloppy as his reasoning.
Nigel Harris says:
July 11, 2012 at 6:39 am
I have assumed all along that it has a high mean value because the data is autocorrelated. This pushes the distribution to be “fat-tailed”, increasing the probability that we will find larger groups and decreasing the probability of smaller groups.
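One way to see that effect: simulate an autocorrelated AR(1) monthly series, flag its top third, and look at the spread of 13-month window counts. This is an illustrative simulation of my own, with an assumed persistence of 0.6, not anything computed in the post:

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi = 1400, 0.6                  # months; assumed AR(1) persistence

x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

flags = x >= np.quantile(x, 2 / 3)  # top third of the simulated series
counts = np.array([flags[i:i + 13].sum() for i in range(n - 12)])
print(counts.var() / counts.mean()) # ratios above 1 mean overdispersion
                                    # vs Poisson; persistence fattens the tail
```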
pjie2 says:
July 11, 2012 at 6:21 am
I disagree. We have not shown it is not a Poisson distribution. We have shown that it is a special kind of Poisson distribution, a “fat-tailed” Poisson distribution where all results are shifted to somewhat higher values.
Can we draw conclusions from that? Because of the agreement of the calculated “lambda” in the smaller and larger datasets, along with the smaller RMS error in the larger dataset, I say that we can, because the distribution actually represents and accurately describes the data.
And since the distribution and the data agree, since the distribution accurately describes the data, it doesn’t matter how we arrived at that distribution, or how it is calculated.
My thanks to both of you,
w.
PS: Upon reflection, I see that you are right that I shouldn’t have fit the value of lambda; I should have used the mean of the actual data. In fact, the mean of the first analysis (June to June) is 5.15, while the mean of the second dataset is 5.17. By fitting, I had gotten a value of 5.21 for both, a trivial difference … which shows definitively that it is indeed a Poisson process, and so your objections in both cases do not apply. I’ve added an update to the head post acknowledging my error, and thanking you both for pointing it out.
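For reference, the correction is even simpler than a solver: the maximum-likelihood estimate of a Poisson lambda is just the sample mean. Reusing the window_counts from the earlier sketch:

```python
import numpy as np

lam_mle = np.mean(window_counts)  # ML estimate of lambda: the sample mean
print(lam_mle)                    # ~5.17 per the numbers quoted above;
                                  # the solver's ~5.21 differs only slightly
```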
Nigel Harris says:
July 11, 2012 at 7:21 am
No, that’s not what I looked at at all. I looked to see whether June of year X was in the top third of Junes, July of year X was in the top third of Julys, and so on. That’s why I got the same answer that Jeff Masters got, that June 2011 to June 2012 was the only interval with 13 months all in the warmest third. If I’d done what you claim above, I wouldn’t have found that.
w.
Rod Everson says:
July 11, 2012 at 8:26 am
Sorry, but your common sense has failed you, and your lack of a stats background is showing. See my post above.
w.
verbal1 says:
July 11, 2012 at 8:35 am
Why would I want to “differentiate what we would expect without warming from what we would expect with warming”? I’m just looking at the data, and seeing from the data what the distribution is. Yes, the distribution would be different if the globe hadn’t been warming for centuries … so what? I’m looking simply at the odds of finding 13 out of 13 months in the warmest third.
w.
BCC says:
July 11, 2012 at 9:50 am
Sorry, BCC, but in fact I have shown above that this is the right model. How have I shown it? Because the value that I got from iteratively fitting lambda is almost exactly that of the theoretical lambda, which as you point out is “known by definition” in both cases. I have added an update to the head post discussing this.
You would have known this, BCC, if you had bothered to download the data and do the math yourself before uncapping your electronic pen … me, I wouldn’t hire you to do any work with me; you give opinions without first doing your homework.
w.