
Jeff Id of The Air Vent emailed me today inviting me to repost Ryan O’s latest work on statistical evaluation of the Steig et al “Antarctica is warming” paper (Nature, Jan 22, 2009). I thought long and hard about the title, especially after reviewing the previous work from Ryan O we posted on WUWT where the paper was dealt a serious blow to “robustness”. After reading this latest statistical analysis, I think it is fair to conclude that the paper’s premise has been falsified.
Ryan O, in his conclusion, is a bit more gracious:
I am perfectly comfortable saying that Steig’s reconstruction is not a faithful representation of Antarctic temperatures over the past 50 years and that ours is closer to the mark.
Not only that, Ryan O did a more complete job of the reconstruction than Steig et al did. He mentions this in comments at The Air Vent:
Steig only used 42 stations to perform his reconstruction. I used 98, since I included AWS stations.
The AWS stations have their problems, such as periods of warmer temperatures due to being buried in snow, but even when using this data, Ryan O’s analysis still comes out with less warming than the original Steig et al paper.
Antarctica as a whole is not warming; the Antarctic Peninsula is, and it is significantly removed climatically from the main continent.

It is my view that all Steig and Michael Mann have done with their application of RegEM to the station data is to smear the temperature around much like an artist would smear red and white paint on a palette board to get a new color “pink” and then paint the entire continent with it.
It is a lot like “spin art” you see at the county fair. For example, look (at left) at the different tiles of colored temperature results for Antarctica you can get using Steig’s and Mann’s methodology. The only things that change are the starting parameters; the data remains the same, while the RegEM program smears it around based on those starting parameters. In the Steig et al case, PC and regpar were chosen by the authors to be a value of 3. Choosing any different numbers yields an entirely different result.
So the premise of the Steig et al paper boils down to an arbitrary choice of values that “looked good”.
I hope that Ryan O will write a rebuttal letter to Nature, and/or publish a paper. It is the only way the Team will back down on this. – Anthony
UPDATE: To further clarify, Ryan O writes in comments:
“Overall, Antarctica has warmed from 1957-2006. There is no debating that point. (However, other than the Peninsula, the warming is not statistically significant.)
The important difference is the location of the warming and the magnitude of the warming. Steig’s paper has the warming concentrated on the Ross Ice Shelf – which would lead you to entirely different conclusions than having a minimum on the ice shelf. As far as magnitude goes, the warming for the continent is half of what was reported by Steig (0.12 vs. 0.06 Deg C/Decade).
Additionally, Steig shows whole-continent warming from 1967-2006; this analysis shows that most of the continent has cooled from 1967-2006. Given that the 1940’s were significantly warmer in the Antarctic than 1957 (the 1957-1960 period was unusually cold in the Antarctic), focusing on 1957 can give a somewhat slanted picture of the temperature trends in the continent.”
Ryan O adds later: “I should have said that all reconstructions yield a positive trend, though in most cases the trend for the continent is not statistically significant.”
Verification of the Improved High PC Reconstruction
Posted by Jeff Id on May 28, 2009
There is always something going on around here.
Up until now, all the work which has been done on the Antarctic reconstruction has been done without statistical verification. We believed the higher-PC reconstructions were better based on correlation vs. distance plots, the visual comparison to station trends, and of course the better approximation of simple area-weighted reconstructions using surface station data.
The authors of Steig et al. have not been queried by myself or anyone else that I’m aware of regarding the quality of the higher PC reconstructions. And the team has largely ignored what has been going on over on the Air Vent. This post however demonstrates strongly improved verification statistics which should send chills down their collective backs.
Ryan was generous in giving credit to others with his wording; he has put together this amazing piece of work himself using bits of code and knowledge gained from the numerous other posts by himself and others on the subject. He’s done a top-notch job again, through a Herculean effort in code and debugging.
If you didn’t read Ryan’s other post which led to this work the link is:
——————————————————————————–
HOW DO WE CHOOSE?
In order to choose which version of Antarctica is more likely to represent the real 50-year history, we need to calculate statistics with which to compare the reconstructions. For this post, we will examine r, r^2, R^2, RE, and CE for various conditions, including an analysis of the accuracy of the RegEM imputation. While Steig’s paper did provide verification statistics against the satellite data, the only verification statistics that related to ground data were provided by the restricted 15-predictor reconstruction, where the withheld ground stations were the verification target. We will perform a more comprehensive analysis of performance with respect to both RegEM and the ground data. Additionally, we will compare how our reconstruction performs against Steig’s reconstruction using the same methods used by Steig in his paper, along with a few more comprehensive tests.
To calculate what I would consider a healthy battery of verification statistics, we need to perform several reconstructions. The reason for this is to evaluate how well the method reproduces known data. Unless we know how well we can reproduce things we know, we cannot determine how likely the method is to estimate things we do not know. This requires that we perform a set of reconstructions by withholding certain information. The reconstructions we will perform are:
1. A 13-PC reconstruction using all manned and AWS stations, with ocean stations and Adelaide excluded. This is the main reconstruction.
2. An early calibration reconstruction using AVHRR data from 1982-1994.5. This will allow us to assess how well the method reproduces the withheld AVHRR data.
3. A late calibration reconstruction using AVHRR data from 1994.5-2006. Coupled with the early calibration, this provides comprehensive coverage of the entire satellite period.
4. A 13-PC reconstruction with the AWS stations withheld. The purpose of this reconstruction is to use the AWS stations as a verification target (i.e., see how well the reconstruction estimates the AWS data, and then compare the estimation against the real AWS data).
5. The same set of four reconstructions as above, but using 21 PCs in order to assess the stability of the reconstruction to included PCs.
6. A 3-PC reconstruction using Steig’s station complement to demonstrate replication of his process.
7. A 3-PC reconstruction using the 13-PC reconstruction model frame as input to demonstrate the inability of Steig’s process to properly resolve the geographical locations of the trends and trend magnitudes.
Using the above set of reconstructions, we will then calculate the following sets of verification statistics:
1. Performance vs. the AVHRR data (early and late calibration reconstructions)
2. Performance vs. the AVHRR data (full reconstruction model frame)
3. Comparison of the spliced and model reconstruction vs. the actual ground station data.
4. Comparison of the restricted (AWS data withheld) reconstruction vs. the actual AWS data.
5. Comparison of the RegEM imputation model frame for the ground stations vs. the actual ground station data.
The provided script performs all of the required reconstructions and makes all of the required verification calculations. I will not present them all here (because there are a lot of them). I will present the ones that I feel are the most telling and important. In fact, I have not yet plotted all the different results myself. So for those of you with R, there are plenty of things to plot.
Without further ado, let’s take a look at a few of those things.
You may remember the figure above; it represents the split reconstruction verification statistics for Steig’s reconstruction. Note the significant regions of negative CE values (which indicate that a simple average of observed temperatures explains more variance than the reconstruction). Of particular note, the region where Steig reports the highest trend – West Antarctica and the Ross Ice Shelf – shows the worst performance.
Let’s compare to our reconstruction:
There still are a few areas of negative RE (too small to see in this panel) and some areas of negative CE. However, unlike the Steig reconstruction, ours performs well in most of West Antarctica, the Peninsula, and the Ross Ice Shelf. All values are significantly higher than the Steig reconstruction, and we show much smaller regions with negative values.
As an aside, the r^2 plots are not corrected by the Monte Carlo analysis yet. However, as shown in the previous post concerning Steig’s verification statistics, the maximum r^2 values using AR(8) noise were only 0.019, which produces an indistinguishable change from Fig. 3.
Now that we know that our method provides a more faithful reproduction of the satellite data, it is time to see how faithfully our method reproduces the ground data. A simple way to compare ours against Steig’s is to look at scatterplots of reconstructed anomalies vs. ground station anomalies:
The 13-PC reconstruction shows significantly improved performance in predicting ground temperatures as compared to the Steig reconstruction. This improved performance is also reflected in plots of correlation coefficient:
As noted earlier, the performance in the Peninsula, West Antarctica, and the Ross Ice Shelf is noticeably better for our reconstruction. Examining the plots this way provides a good indication of the geographical performance of the two reconstructions. Another way to look at this – one that allows a bit more precision – is to plot the results as bar plots, sorted by location:
The difference is quite striking.
While a good performance with respect to correlation is nice, this alone does not mean we have a “good” reconstruction. One common problem is over-fitting during the calibration period (where the calibration period is defined as the periods over which actual data is present). This leads to fantastic verification statistics during calibration, but results in poor performance outside of that period.
This is the purpose of the restricted reconstruction, where we withhold all AWS data. We then compare the reconstruction values against the actual AWS data. If our method resulted in overfitting (or is simply a poor method), our verification performance will be correspondingly poor.
Since Steig did not use AWS stations for performing his TIR reconstruction, this allows us to do an apples-to-apples comparison between the two methods. We can use the AWS stations as a verification target for both reconstructions. We can then compare which reconstruction results in better performance from the standpoint of being able to predict the actual AWS data. This is nice because it prevents us from later being accused of holding the reconstructions to different standards.
Note that since all of the AWS data was withheld, RE is undefined. RE uses the calibration period mean, and there is no calibration period for the AWS stations because we did the reconstruction without including any AWS data. We could run a split test like we did with the satellite data, but that would require additional calculations and is an easier test to pass regardless. Besides, the reason we have to run a split test with the satellite data is that we cannot withhold all of the satellite data and still be able to do the reconstruction. With the AWS stations, however, we are not subject to the same restriction.
With that, I think we can safely put to bed the possibility that our calibration performance was due to overfitting. The verification performance is quite good, with the exception of one station in West Antarctica (Siple). Some of you may be curious about Siple, so I decided to plot both the original data and the reconstructed data. The problem with Siple is clearly the short record length and strange temperature swings (in excess of 10 degrees), which may indicate problems with the measurements:
While we should still be curious about Siple, we also would not be unjustified in considering it an outlier given the performance of our reconstruction at the remainder of the station locations.
Leaving Siple for the moment, let’s take a look at how Steig’s reconstruction performs.
Not too bad – but not as good as ours. Curiously, Siple does not look like an outlier in Steig’s reconstruction. In its place, however, seems to be the entire Peninsula. Overall, the correlation coefficients for the Steig reconstruction are poorer than ours. This allows us to conclude that our reconstruction more accurately calculated the temperature in the locations where we withheld real data.
Along with correlation coefficient, the other statistic we need to look at is CE. Of the three statistics used by Steig – r, RE, and CE – CE is the most difficult statistic to pass. This is another reason why we are not concerned about lack of RE in this case: RE is an easier test to pass.
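For readers who want the definitions spelled out, here is a minimal sketch of RE and CE as they are conventionally defined for this kind of verification. It is written in Python with made-up numbers and is not taken from Recon.R; the function names (`skill`, `re_stat`, `ce_stat`) and the toy anomalies are assumptions for illustration only.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def skill(observed, reconstructed, reference_mean):
    """1 - (squared error of the reconstruction) /
    (squared deviation of the observations from a reference mean)."""
    sse = sum((o - r) ** 2 for o, r in zip(observed, reconstructed))
    ss_ref = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - sse / ss_ref


def re_stat(obs_ver, rec_ver, obs_cal):
    """Reduction of error: the reference is the CALIBRATION-period mean."""
    return skill(obs_ver, rec_ver, mean(obs_cal))


def ce_stat(obs_ver, rec_ver):
    """Coefficient of efficiency: the reference is the VERIFICATION-period mean."""
    return skill(obs_ver, rec_ver, mean(obs_ver))


# Hypothetical anomalies (deg C); not real station or AVHRR data.
obs_cal = [0.5, 0.7, 0.4, 0.6]    # observations in the calibration period
obs_ver = [-0.2, 0.1, -0.1, 0.2]  # withheld observations (verification period)
rec_ver = [0.0, 0.2, 0.1, 0.3]    # reconstruction over the verification period

re_value = re_stat(obs_ver, rec_ver, obs_cal)
ce_value = ce_stat(obs_ver, rec_ver)
print(f"RE = {re_value:.3f}, CE = {ce_value:.3f}")
```

In this toy case the reconstruction scores an RE of roughly 0.92 but a CE of essentially zero, because the withheld observations sit far from the calibration mean. Since the verification-period mean minimizes the sum of squared deviations about any constant, CE can never exceed RE for the same data, which is why CE is the harder test. A negative CE means the plain verification-period average beats the reconstruction.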
The difference in performance between the two reconstructions is more apparent in the CE statistic. Steig’s reconstruction demonstrates negligible skill in the Peninsula, while our skill in the Peninsula is much higher. With the exception of Siple, our West Antarctic stations perform comparably. For the rest of the continent, our CE statistics are significantly higher than Steig’s – and we have no negative CE values.
So in a test of which method best reproduces withheld ground station data, our reconstruction shows significantly more skill than Steig’s.
The final set of statistics we will look at is the performance of RegEM. This is important because it will show us how faithful RegEM was to the original data. Steig did not perform any verification similar to this because PTTLS does not return the model frame. Unlike PTTLS, however, our version of RegEM (IPCA) does return the model frame. Since the model frame is accessible, it is incumbent upon us to look at it.
Note: In order to have a comparison, we will run a Steig-type reconstruction using RegEM IPCA.
There are two key statistics for this: r and R^2. R^2 is called “average explained variance”. It is a similar statistic to RE and CE with the difference being that the original data comes from the calibration period instead of the verification period. In the case of RegEM, all of the original data is technically “calibration period”, which is why we do not calculate RE and CE. Those are verification period statistics.
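A minimal sketch of the two statistics, assuming R^2 here is the usual "1 minus squared error over total variance about the mean of the original data". It is written in Python with made-up numbers and is not taken from Recon.R; the names `pearson_r` and `explained_variance` and the toy series are assumptions for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def explained_variance(original, model_frame):
    """R^2 = 1 - SSE / SST, with SST taken about the mean of the original data."""
    mean_original = sum(original) / len(original)
    sse = sum((o - m) ** 2 for o, m in zip(original, model_frame))
    sst = sum((o - mean_original) ** 2 for o in original)
    return 1.0 - sse / sst


# Hypothetical station anomalies and a model frame that tracks their shape
# perfectly but with only 30% of the amplitude.
original = [-1.0, 0.0, 1.0, 0.0]
model_frame = [0.3 * v for v in original]

r_value = pearson_r(original, model_frame)
r2_value = explained_variance(original, model_frame)
print(f"r = {r_value:.2f}, R^2 = {r2_value:.2f}")
```

Here r is 1 because the shape matches exactly, yet R^2 is only about 0.51 because the amplitude is badly damped. That is the sense in which R^2 is the more demanding of the two: correlation alone does not penalize an imputation for getting the magnitudes wrong.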
Let’s look at how RegEM IPCA performed for our reconstruction vs. Steig’s.
As you can see, RegEM performed quite faithfully with respect to the original data. This is a double-edged sword; if RegEM performs too faithfully, you end up with overfitting problems. However, we already checked for overfitting using our restricted reconstruction (with the AWS stations as the verification target).
While we had used regpar settings of 9 (main reconstruction) and 6 (restricted reconstruction), Steig only used a regpar setting of 3. This leads us to question whether that setting was sufficient for RegEM to be able to faithfully represent the original data. The only way to tell is to look, and the next frame shows us that Steig’s performance was significantly less than ours.
Fig. 14: Correlation coefficient between RegEM model frame and actual ground data, Steig reconstruction
The performance using a regpar setting of 3 is noticeably worse, especially in East Antarctica. This would indicate that a setting of 3 does not provide enough degrees of freedom for the imputation to accurately represent the existing data. And if the imputation cannot accurately represent the existing data, then its representation of missing data is correspondingly suspect.
Another point I would like to note is the heavy weighting of Peninsula and open-ocean stations. Steig’s reconstruction relied on a total of 5 stations in West Antarctica, 4 of which are located on the eastern and southern edges of the continent at the Ross Ice Shelf. The resolution of West Antarctic trends based on the ground stations alone is rather poor.
Now that we’ve looked at correlation coefficients, let’s look at a more stringent statistic: average explained variance, or R^2.
Using a regpar setting of 9 also provides good R^2 statistics. The Peninsula is still a bit wanting. I checked the R^2 for the 21-PC reconstruction and the numbers were nearly identical. Without increasing the regpar setting and running the risk of overfitting, this seems to be about the limit of the imputation accuracy.
Steig’s reconstruction, on the other hand, shows some fairly low values for R^2. The Peninsula is an odd mix of high and low values, West Antarctica and Ross are middling, while East Antarctica is poor overall. This fits with the qualitative observation that the Steig method seemed to spread the Peninsula warming all over the continent, including into East Antarctica – which by most other accounts is cooling slightly, not warming.
CONCLUSION
With the exception of the RegEM verification, all of the verification statistics listed above were performed exactly (split reconstruction) or analogously (restricted 15 predictor reconstruction) by Steig in the Nature paper. In all cases, our reconstruction shows significantly more skill than the Steig reconstruction. So if these are the metrics by which we are to judge this type of reconstruction, ours is objectively superior.
As before, I would qualify this by saying that not all of the errors and uncertainties have been quantified yet, so I’m not comfortable putting a ton of stock into any of these reconstructions. However, I am perfectly comfortable saying that Steig’s reconstruction is not a faithful representation of Antarctic temperatures over the past 50 years and that ours is closer to the mark.
NOTE ON THE SCRIPT
If you want to duplicate all of the figures above, I would recommend letting the entire script run. Be patient; it takes about 20 minutes. While this may seem long, remember that it is performing 11 different reconstructions and calculating a metric butt-ton of verification statistics.
There is a plotting section at the end that has examples of all of the above plots (to make it easier for you to understand how the custom plotting functions work) and it also contains indices and explanations for the reconstructions, variables, and statistics. As always, though, if you have any questions or find a feature that doesn’t work, let me know and I’ll do my best to help.
Lastly, once you get comfortable with the script, you can probably avoid running all the reconstructions. They take up a lot of memory, and if you let all of them run, you’ll have enough room for maybe 2 or 3 more before R refuses to comply. So if you want to play around with the different RegEM variants, numbers of included PCs, and regpar settings, I would recommend getting comfortable with the script and then loading up just the functions. That will give you plenty of memory for 15 or so reconstructions.
As a bonus, I included the reconstruction that takes the output of our reconstruction, uses it for input to the Steig method, and spits out this result:
The name for the list containing all the information and trends is “r.3.test”.
—————————————————————-
Code is here Recon.R
















Lee Kington (18:36:59) :
You are of course correct.
This review, paper or whatever you want to call it is open to review with all data & methods open to make it replicable.
This is in stark contrast to the gang-green who hide their methods & data in an effort to bamboozle.
I cannot lay claim to gang-green but forget to whom it is attributable.
DaveE
Look no further than WUWT. http://earthobservatory.nasa.gov/images/imagerecords/6000/6502/antarctic_temps.AVH1982-2004.jpg
I was trying to find West Antarctic’s contribution to circumpolar warming. The clockwise flow viewed from the Pole would seem to indicate that the Ross ice shelf is protected from ocean warming at least from that source, hence its growth over the last 30years.
Sir, I must object.
Despite the simple fact that this article is about the south polar temperatures and despite the fact that it has images of heat and cold, you refused to take my fine advice and title such an article “Halle Berry hot images of Antarctica”.
So here we are with adverts for some Centerspace Math Lab library, when for darned near no effort on your part we could have ads for Halle Berry
Clearly you care not one whit for your audience.
How could you!
Keith Minto (20:03:33) :
It needs to include the term that they throw at us, ‘deniers’. It has been used before but ‘Natural Climate Change Deniers’ is clumsy but on the right track.
It’s just the way you say it.
Fire Hot!
Ice Cold!
Climate Change!
Maybe add a picture of this guy.
http://redwoodr.files.wordpress.com/2008/05/caveman.jpg
Ryan, I would have to agree with George. I would suggest you consider a slight revision here. Your work here is fantastic! And, it deserves to utilize the full brunt of the “significance” of your findings. Please consider George’s words, I believe he is on to something important here and I would like to see you gain from the full impact of this fine piece. I see nothing insignificant about this writing at all, it is truly a very fine piece of work!
my 2 cents
I think “eco-Al-Qaeda” has a better ring to it. Just a bunch of Osama-Bin-Warmers … lol
Insisters
Anthony’s REPLY to Benjamin P. @ 07:24:49
Where or who is the official climate zone designator? As far as I know someone writes a few papers and then publishes a map with their view of the world. Just because two or three of these have been done and used repeatedly for explanatory and teaching purposes does not make them correct or official. None were focusing on Antarctica when they did their work.
So design your map. Get a fresh design by a computer pro and e-mail to the Association of American Geographers and ask them to send the link out to all their specialty groups that might be interested. Then send it to all those with current earth science textbooks on the market and give them permission to use it in the next edition.
Task accomplished.
Nasif Nahle (16:56:57) :
I just was trying to point to the nature of Steig’s error because you said that it was only a measurement, when the interpretation of that measurement was involved.
No, not at all. There is no interpretation when it comes to TSI. There is simply a not-understood measurement error. The Steig analogy would be that some temperatures were measured wrong in the first place, perhaps with thermometers that were leaking or such. In such a situation you correct the error once you have decided what it is and how best to correct it. You do not interpret the faulty data once you know they are faulty.
I like Gullible Warmers. (I don’t know who came up with that one.)
As others have mentioned already, they should not be let off with the fact that they all started out as global warmers but are now climate changers. They must have started covering their behinds when they realised that things were starting to cool down.
Stephen Brown (13:08:41) :
“Chris S (04:50:12) :
I await the publication of this after peer review with baited breath”
Erm … Wouldn’t the word “bated” be more appropriate? It’s breathing you are talking about, not fishing!
Your friendly nit-picking pedant.”
As you-all have led us off topic may I suggest you read this:
http://www.worldwidewords.org/qa/qa-bai1.htm
Sorry, but I meant to include this:
“Cruel Clever Cat:
Sally, having swallowed cheese,
Directs down holes the scented breeze,
Enticing thus with baited breath
Nice mice to an untimely death.”
wonder why CT is showing this
http://arctic.atmos.uiuc.edu/cryosphere/NEWIMAGES/arctic.seaice.color.000.png (which seem fine)
but not tracing for some time now (15th feb) eyeballing methinks this is still WAY ABOVE last years and probably mean by now
HavocHounds, WarmingWorriers, DoomsdayDupes, DoomsdayDogs, ScorchedEarthers, CalamityPushers, Cryophiles, WarmSwogglers, PanicPeddlers…
Appropriationers, PanicProfiteers, WarmingExploiters, PanicRacketeers, NeoPuritans, NeoCarboCons, ClimateConformists, Mind-Warmed-Robots, HelplessSheepPeople, SheepPeople, GoreLackeys, HotEarthEmos…
Leif Svalgaard (16:10:43) : “[TSI] does not deviate…The best modern measurements from TIM on SORCE is about 5 W.m^2 below the ‘normal’ TSI of 1366 because of instrumental differences. The ‘relative error’ [that is the error on any given instrument compared to earlier values from the same instrument] is almost a thousand times better, at the 0.007 W/m^2 level. We know the variation of the TSI value over time better than 0.1 W/m^2.”
Then if the error is of only 0.1 W/m^2, then it should be 1366.0 W/m^2.
Chip,
I can only suggest the bleedin obvious:
The “Prophets of Doom” (or Pod if you like).
BTW wuwters pls do not rubbish us poor geriatric cruisers. Most have a very broad view of the world and many have seen places that some wuwters could only dream of.
I think it is also ironic that a cruise ship sank not too long ago in the Antarctic by hitting ice, amazingly without any loss of life! Had there been deaths, I think it would have been a wake up call for people to realise that it is not a place that you go snorkelling:
http://news.bbc.co.uk/2/hi/americas/7108835.stm
My 2 cents, for what it is worth. I have a MS in stats. I follow the calculations. But I do not care for Principal Component Analysis. I do not believe reliable inferences can be made using that methodology.
The magnitude of the alleged temperature change, whether Steig’s or Ryan’s, is below detectable limits. The measurement error exceeds hundredths of a degree per decade.
Lack of statistical significance means that the statistic is not different from zero. The actual change may be positive or negative. It therefore remains highly debatable whether Antarctica has warmed, or cooled, over the period analyzed.
All reconstructions had a positive sign. That would be a potentially significant finding, if the values were different than zero (that is, if they exceeded the measurement error). But they do not. In my opinion, the best that can be said is that using the (questionable) method, no trend was detected.
Warming is neither confirmed nor rejected. There is no smoking gun one way or the other. Uncertainty prevails. Nobody’s agenda is served; nothing is bunked or debunked. In my opinion.
But kudos for the attempt.
[snip–too Nazi]
18,000 yr into our interglacial, and Steig et al can “Robustly” (within a 50 yr tick of climate time, using 42 stations) show climate change. CONGRATULATIONS! Climate change - just like the whole history of earth.
Because everything is connected. We are one connected people, and if they are suffering over there, it is because of our individual actions over here. Or at least that’s the grand-narrative/picture they want us to see.
But whilst it is true that we are all connected, that doesn’t mean the effects on another part of the planet can be traced and linked back to my individual actions.
If everyone’s individual actions make a difference and have an effect on everything and everyone else, that means everybody’s actions are combined in unpredictable and complex ways, which means that no global effect can be traced back to anyone in particular. So yes, my actions have effects on the whole world, but those effects are completely unpredictable.
The belief that my actions “cause” global warming, or “cause” famines, or “cause” extinctions, has been said by psychologists to constitute a massive ego trip for the individual and their sense of omnipotence. The idea that a tiny trace gas causes global warming, is so believable because people already believe something very similar–that a tiny action on my part causes world famines. Unplug that mobile phone charger to save the planet!
The counter-argument that CO2 is just one variable in a highly complex system, is just like the argument to inflated egos, ie. that people should remember some humility. The reality is that their personal actions are just one tiny small part in the complex of billions of people and multiple ecosystems and cultures on the planet. But we’ve just got 4 years to save the planet! Most people can’t stick to a diet for that long, let alone change the course of planetary history. One would say this more often, but that would deflate the ego somewhat…
I’m nominating Ian Plimer for quote of the week:
IT is well known that many university staff list to port and try to engineer a brave new world. The cash cow climate institutes now seem to be drowning in their own self-importance.. The rest of the article can be found here:
http://www.theaustralian.news.com.au/story/0,25197,25552775-7583,00.html
PS I’m still waiting for my copy of his book. If anyone has Plimer’s e-mail address can you ask him to lean on the publishers for another reprint.
(14:29:36) :
Reply: Oy, nitpicking about nitpicking, and btw, you misspelled humorous. ~ charles the sometimes anti-semantic moderator.
Lots of anti-semanticism going around these days, and we grammar-n*zi’s fight the good fight. For the spelling errors, I recommend downloading the Firefox browser, which has a built-in spell checker.
Also sorry to hear some among us have nits.
On Topic, any chance of an explanation of what the “regpar” value does?
Anthony, as a layman, with only a very basic science background, it seems to me that efforts on this site and others have shot many holes in the hot air balloon and it must be in deep trouble.
What exactly does it take to appeal to those with a science background who still feel that our influence on the climate will have catastrophic results ?
What is the state of the play ?
I know that the general public put global warming/climate change very low on their list of concerns, but who speaks for science and how does a layman know which way the scientific community is bending ?
Let me make two philo-semantic observations.
1. To wait with ‘bated breath’ means, literally, to be holding one’s breath in anxious anticipation. The verb ‘bate’, which has become obsolete except in this expression, has a living first cousin in the verb ‘abate’, meaning ‘lessen’.
2. The Latin abbreviation ‘et al’ stands for ‘et alii’, ‘and others’. In academia, these ‘alii’ are usually indentured servants known as graduate students who have done most of the gruntwork for the Immortals who are recognized by name as the authors.
I was too busy yesterday to keep up, so this morning I start reading from the end forward, and was very intrigued by the discussion over “statistically significant” attributed to a comment from Leif. Eventually, I got to it.
Leif Svalgaard (10:20:43) :
while I generally agree with Ryan’s result and think his post is very good, I do have a slight problem with this statement:
“Overall, Antarctica has warmed from 1957-2006. There is no debating that point. (However, other than the Peninsula, the warming is not statistically significant)”
A measurement is a measurement and is not in itself ’statistically significant’ or not. If I measure the temperature outside and it is 67.2F, that is what it is and it carries no statistical significance as that is a concept that does not apply here. The significance comes in if you compare the measured value to its ‘expected’ value and want to argue that it is significantly different than the observed spread in such differences. So, one may ask what the expected value for the Antarctic would be and what the observed spread is.
Leif, I understand your basic point, but question whether it is the whole story here. Over time, temperature varies. Sometimes we think we see a trend in the variation. It is commonplace to measure whether the trend, or change, is “statistically significant” by comparison to an “expected value” based on the mean. When I read a statement like “However, other than the Peninsula, the warming is not statistically significant” I would assume that Ryan simply meant that any trend is not significantly different than the mean (a flat line).
Without going back and looking to see if this is truly the case or not, my impression from a quick read was that by lowering the computed change, we were now down to a number not significantly different than the mean.