Steig's Antarctic Heartburn

Art courtesy Dave Stephens

Foreword by Anthony Watts: This article, written by the two Jeffs (Jeff C and Jeff Id), is one of the more technically complex essays ever presented on WUWT. It has been several days in the making. One of the goals I have with WUWT is to make sometimes difficult to understand science understandable to a wider audience. In this case the statistical analysis is rather difficult for the layman to comprehend, but I asked for (and got) an essay that was explained in terms I think many can grasp and understand. That being said, it is a long article, and you may have to read it more than once to fully grasp what has been presented here. Steve McIntyre of Climate Audit laid much of the groundwork for this essay, and from his work as well as this essay, it is becoming clearer that Steig et al (see “Warming of the Antarctic ice-sheet surface since the 1957 International Geophysical Year”, Nature, Jan 22, 2009) isn’t holding up well to rigorous tests as demonstrated by McIntyre as well as in the essay below. Unfortunately, despite several requests, Steig’s office has so far deferred providing the complete data sets needed to replicate and test his paper. Steig has left on a trip to Antarctica, and the remaining data is not “expected” to be available until his return.

To help layman readers understand the terminology used, here is a mini-glossary in advance:

RegEM – Regularized Expectation Maximization

PCA – Principal Components Analysis

PC – Principal Components

AWS – Automatic Weather Stations

One of the more difficult concepts is RegEM, an algorithm developed by Tapio Schneider in 2001. It’s a form of expectation maximization (EM), which is a common and well understood method for infilling missing data. As we’ve previously noted on WUWT, many of the weather stations used in the Steig et al study had issues with being buried by snow, causing significant data gaps in the Antarctic record, and in some burial cases stations have been accidentally lost or confused with others at different lat/lons. Then of course there is the problem of coming up with trends for the entire Antarctic continent when most of the weather station data is from the periphery and the peninsula, with very little data from the interior.

Expectation Maximization is a method which uses a normal distribution to compute the best probability of fit to a missing piece of data. Regularization is required when so much data is missing that the EM method won’t solve. That makes it a statistically dangerous technique to use and, as Kevin Trenberth, climate analysis chief at the National Center for Atmospheric Research, said in an e-mail: “It is hard to make data where none exist.” (Source: MSNBC article) It is also valuable to note that one of the co-authors of Steig et al, Dr. Michael Mann, dabbles quite a bit in RegEM in this preparatory paper to Mann et al 2008 “Return of the Hockey Stick”.
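For readers who want to see the idea in code, here is a minimal sketch of EM-style infilling with a ridge term added for regularization. It is an illustration only and is not Schneider's RegEM code: the function name `em_infill`, the `ridge` value and the fixed iteration count are assumptions made for this example, and the sketch leaves out the covariance correction that a full EM implementation applies.

```python
import numpy as np

def em_infill(data, ridge=1e-3, n_iter=50):
    """Fill NaN gaps in a (time x station) array using the other columns.

    Simplified EM-style loop: estimate mean and covariance from the
    currently filled matrix, then replace each missing value with its
    conditional expectation given the values observed in the same row.
    """
    x = np.array(data, dtype=float)
    missing = np.isnan(x)
    # Start by filling every gap with its column mean.
    col_means = np.nanmean(x, axis=0)
    x[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        mu = x.mean(axis=0)
        cov = np.cov(x, rowvar=False)
        for i in np.where(missing.any(axis=1))[0]:
            m = missing[i]
            o = ~m
            if not o.any():
                continue  # nothing observed in this row, keep the current fill
            # The ridge term keeps the solve stable when data are sparse
            # (an assumed, illustrative form of regularization).
            s_oo = cov[np.ix_(o, o)] + ridge * np.eye(o.sum())
            s_mo = cov[np.ix_(m, o)]
            x[i, m] = mu[m] + s_mo @ np.linalg.solve(s_oo, x[i, o] - mu[o])
    return x
```

Notice that nothing in the function knows where a station is located. The infilled values depend only on how the columns co-vary, which is the property the rest of this article examines.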

For those that prefer to print and read, I’ve made a PDF file of this article available here.

Introduction

This article is an attempt to describe some of the early results from the Antarctic reconstruction recently published on the cover of Nature, which demonstrated a warming trend in the Antarctic since 1957. Actual surface temperatures in the Antarctic are hard to come by, with only about 30 stations prior to 1980 recorded through tedious and difficult efforts by scientists in the region. In the 1980s more stations were added, including some automatic weather stations (AWS) which sit in remote areas and report the temperature information automatically. Unfortunately, due to the harsh conditions in the region, many of these stations have gaps in their records or very short reporting times (a few years in some cases). Very few stations are located in the interior of the Antarctic, leaving the trend for the central portion of the continent relatively unknown. The location of the stations is shown on the map below.

In addition to the stations there are satellite data from an infrared surface temperature measurement which records the temperature of the actual emission from the surface of the ice/ground in the Antarctic. This is different from the microwave absorption measurements as made from UAH/RSS data which measure temperatures in a thickness of the atmosphere. This dataset didn’t start until 1982.

Steig 09 is an attempt to reconstruct the continent-wide temperatures using a combination of measurements from the surface stations shown above and the post-1982 satellite data. The complex math behind the paper is an attempt to ‘paste’ the 30ish pre-1982 real surface station measurements onto 5509 individual gridcells from the satellite data. An engineer or vision system designer could use several straightforward methods which would ensure reasonable distribution of the trends across the grid, based on a huge variety of area weighting algorithms. The accuracy of any of these methods would depend on the amount of data available. These well understood methods were ignored in Steig09 in favor of RegEM.

The use of Principal Component Analysis in the reconstruction

Steig 09 presents the satellite reconstructions as the trend and also provides an AWS reconstruction as verification of the satellite data rather than a separate stand alone result presumably due to the sparseness of the actual data. An algorithm called RegEM was used for infilling the missing data. Missing data includes pre 1982 for satellites and all years for the very sparse AWS data. While Dr. Steig has provided the reconstructions to the public, he has declined to provide any of the satellite, station or AWS temperature measurements used as inputs to the RegEM algorithm. Since the station and AWS measurements were available through other sources, this paper focuses on the AWS reconstruction.

Without getting into the detail of PCA analysis, the algorithm uses covariance to assign weighting of a pattern in the data and does not have any input whatsoever for actual station location. In other words, the algorithm has no knowledge of the distance between stations and must infill missing data based solely on the correlation with other data sets. This means there is a possibility that with improper or incomplete checks, a trend from the peninsula on the west coast could be applied all the way to the east. The only control is the correlation of one temperature measurement to another.

If you were an engineer concerned with the quality of your result, you would recognize the possibility of accidental mismatch and do a reasonable amount of checking to ensure that the stations were properly assigned after infilling. Steig et al. described no attempts to check this basic potential problem with RegEM analysis. This paper will describe a simple method we used to determine that the AWS reconstruction is rife with spurious (i.e. appear real but really aren’t) correlations attributed to the methods used by Dr. Steig. These spurious correlations can take a localized climatic pattern and “smear” it over a large region that lacks adequate data of its own.

Now is where it becomes a little tricky. RegEM uses a reduced information dataset to infill the missing values. The dataset is reduced by Principal Component Analysis (PCA) replacing each trend with a similar looking one which is used for covariance analysis. Think of it like a data compression algorithm for a picture which uses less computer memory than the actual but results in a fuzzier image for higher compression levels.

While the second image is still visible, the actual data used to represent the image is reduced considerably. This will work fine for pictures with reasonable compression, but the data from some pixels has blended into others. Steig 09 uses 3 trends to represent all of the data in the Antarctic. In its full complexity, using 3 PCs is analogous to representing not just a picture but actually a movie of the Antarctic with three color ‘trends’, where the color of each pixel changes according to different weights of the same red, green and blue color trends (PCs). With enough PCs the movie could be replicated perfectly with no loss. Here’s an important quote from the paper.

“We therefore used the RegEM algorithm with a cut-off parameter K=3. A disadvantage of excluding higher-order terms (k>3) is that this fails to fully capture the variance in the Antarctic Peninsula region. We accept this tradeoff because the Peninsula is already the best-observed region of the Antarctic.”

Above: a graph from Steve McIntyre of ClimateAudit where he demonstrates how “K=3 was in fact a fortuitous choice, as this proved to yield the maximum AWS trend, something that will, I’m sure, astonish most CA readers.”

K=3 means only 3 trends were used; the ‘lack of captured variance’ is an acknowledgement and acceptance of the fuzziness of the image. It’s easy to imagine that it would be difficult to represent a complex movie of Antarctic temperature from 1957 to 2006 with any sharpness using the same 3 color trends reweighted for every pixel. In the satellite version of the Antarctic movie the three trends look like this.

Note that the sudden step in the 3rd trend would cause a jump in the ‘temperature’ of the entire movie. This represents the temperature change between the pre-1982 recreated data and the post-1982 real data in the satellite reconstruction. This is a strong yet overlooked hint that something may not be right with the result.

In the case of the AWS reconstruction we have only 63 AWS stations to make the movie screen, by which the trends of 42 surface station points are used to infill the remaining data. If the data from one surface station is copied to the wrong AWS stations the average will overweight and underweight some trends. So the question becomes, is the compression level too high?
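The compression analogy can be made concrete with a small sketch. It rebuilds a (time x station) matrix from only its first k principal components and reports how much variance those components retain. The function name and the synthetic test field are assumptions for illustration; this is plain truncated PCA via SVD, not the RegEM procedure itself.

```python
import numpy as np

def truncate_to_k_pcs(anomalies, k):
    """Rebuild a (time x station) matrix from its first k principal components.

    Returns the approximation and the fraction of variance it retains.
    """
    column_mean = anomalies.mean(axis=0)
    centered = anomalies - column_mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the k strongest patterns; the rest of the detail is discarded.
    approx = (u[:, :k] * s[:k]) @ vt[:k, :]
    retained = (s[:k] ** 2).sum() / (s ** 2).sum()
    return approx + column_mean, retained

# Synthetic field (not real data): 5 independent patterns seen at 20 stations.
rng = np.random.default_rng(1)
field = rng.normal(size=(120, 5)) @ rng.normal(size=(5, 20))

for k in (3, 5):
    approx, retained = truncate_to_k_pcs(field, k)
    print(k, "PCs retain", round(float(retained), 3), "of the variance;",
          "largest error", round(float(np.abs(approx - field).max()), 3))
```

With as many PCs as there are independent patterns the field is recovered without loss. With fewer, every station is forced to be a blend of the same few patterns, which is the "fuzzier image" described above.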

The problems that arise when using too few principal components

Fortunately, we’re here to help in this matter. Steve McIntyre again provided the answer with a simple plot of the actual surface station data correlation with distance. This correlation plot compares the similarities ‘correlation’ of each temperature station with all of the 41 other manual surface stations against the distance between them. A correlation of 1 means the data from one station is exactly equal to the other. Because A -> B correlation isn’t a perfect match for B->A there are 42*42 separate points in the graph. This first scatter plot is from measured temperature data prior to any infilling of missing measurements. Station to station distance is shown on the X axis. The correlation coefficient is shown on the Y axis.

Since this plot above represents the only real data we have existing back to 1957, it demonstrates the expected ‘natural’ spatial relationship from any properly controlled RegEM analysis. The correlation drops with distance which we would expect because temps from stations thousands of miles away should be less related than those next to each other. (Note that there are a few stations that show a positive correlation beyond 6000 km. These are entirely from non-continental northern islands inexplicably used by Steig in the reconstruction. No continental stations exhibit positive correlations at these distances.) If RegEM works, the reconstructed RegEM imputed (infilled) data correlation vs. distance should have a very similar pattern to the real data. Here’s a graph of the AWS reconstruction with infilled temperature values.

Compare this plot with the previous plot from actual measured temperatures. The infilled AWS reconstruction has no clearly evident pattern of decay over distance. In fact, many of the station pairs show a correlation of close to 1 at separations of 3000 km! The measured station data is our best indicator of true Antarctic trends and it shows no sign that these long distance correlations occur. Of course, common sense should also make one suspicious of these long distance correlations, as they would be comparable to data that indicated Los Angeles and Chicago had closely correlated climate.

It was earlier mentioned that the use of 3 PCs was analogous to the loss of detail that occurs in data compression. Since the AWS input data is available, it is possible to regenerate the AWS reconstruction using a higher number of PCs. It stood to reason that spurious correlations could be reduced by retaining the spatial detail lost in the 3 PC reconstruction. Using RegEM, we generated a new AWS reconstruction using the same input data but with 7 PCs. The distance correlations are shown in the plot below.

Note the dramatic improvement over that shown in the previous plot. The correlation decay with distance so clearly seen in the measured station temperature data has returned. While the cone of the RegEM data is slightly wider than the ‘real’ surface station data, the counterintuitive long distance correlations seen in the Steig reconstruction have completely disappeared. It seems clear that limiting the reconstruction to 3 PCs resulted in numerous spurious correlations when infilling missing station data.
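The distance-correlation check used in this section is simple enough to sketch. The code below is a hedged illustration, not the script behind the plots: the function names, the haversine distance with an assumed mean Earth radius, and the minimum-overlap rule are all choices made for this example. It computes each station pair once.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def distance_correlation_pairs(lats, lons, series, min_overlap=3):
    """Return (distance_km, correlation) for every pair of stations.

    series is a (time x station) array with NaN where a value is missing.
    Only time steps present at both stations are used for each pair.
    """
    pairs = []
    n_stations = series.shape[1]
    for i in range(n_stations):
        for j in range(i + 1, n_stations):
            both = ~np.isnan(series[:, i]) & ~np.isnan(series[:, j])
            if both.sum() < min_overlap:
                continue
            r = np.corrcoef(series[both, i], series[both, j])[0, 1]
            d = great_circle_km(lats[i], lons[i], lats[j], lons[j])
            pairs.append((d, r))
    return pairs
```

Plotting the first element of each pair on the X axis and the second on the Y axis gives a scatter of the kind discussed above. Running it once on measured data and once on infilled data is what exposes the long distance correlations.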

Using only 3 principal components distorts temperature trends

If Antarctica had uniform temperature trends across the continent, the spurious correlations might not have a large impact on the overall reconstruction. Individual sites may have some errors, but the overall trend would be reasonably close. However, Antarctica is anything but uniform. The spurious correlations can allow unique climatic trends from a localized region to be spread over a larger area, particularly if an area lacks detailed climate records of its own. It is our conclusion that this is exactly what is happening with the Steig AWS reconstruction.

Consider the case of the Antarctic Peninsula:

The peninsula is geographically isolated from the rest of the continent
The peninsula is less than 5% of the total continental land mass
The peninsula is known to be warming at a rate much higher than anywhere else in Antarctica
The peninsula is bordered by a vast area known as West Antarctica that has extremely limited temperature records of its own
15 of the 42 temperature surface stations (35%) used in the reconstruction are located on the peninsula

If the Steig AWS reconstruction were properly correlating the peninsula stations’ temperature measurements to the AWS sites, you would expect to see the highest rates of warming at the peninsula extremes. This is the pattern seen in the measured station data. The plot below shows the temperature trends for the reconstructed AWS sites for the period of 1980 to 2006. This time frame has been selected as this is the period when AWS data exists. Prior to 1980, 100% of AWS reconstructed data is artificial (i.e. infilled by RegEM).

Note how warming extends beyond the peninsula extremes down toward West Antarctica and the South Pole. Also note the relatively moderate cooling in the vicinity of the Ross Ice Shelf (bottom of the plot). The warming once thought to be limited to the peninsula appears to have spread. This “smearing” of the peninsula warming has also moderated the cooling of the Ross Ice Shelf AWS measurements. These are both artifacts of limiting the reconstruction to 3 PCs.

Now compare the above plot to the new AWS reconstruction using 7 PCs.

The difference is striking. The peninsula has become warmer and warming is largely limited to its confines. West Antarctica and the Ross Ice Shelf area have become noticeably cooler. This agrees with the commonly-held belief prior to Steig’s paper that the peninsula is warming, the rest of Antarctica is not.

Temperature trends using more traditional methods

In providing a continental trend for Antarctica warming, Steig used a simple average of the 63 AWS reconstructed time series. As can be seen in the plots above, the AWS stations are heavily weighted toward the peninsula and the Ross Ice Shelf area. Steig’s simple average is shown below. The linear trend for 1957 through 2006 is +0.14 deg C/decade. It is worth noting that if the time frame is limited to 1980 to 2006 (the period of actual AWS measurements), the trend changes to cooling, -0.06 deg C/decade.
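For reference, a trend in deg C/decade is an ordinary least-squares slope scaled by ten. The sketch below shows the calculation on a synthetic series with made-up numbers (not the reconstruction data), split into the same three periods used later in this article. The function name is an assumption for illustration.

```python
import numpy as np

def trend_per_decade(years, anomalies):
    """Least-squares linear trend in deg C per decade, skipping NaN values."""
    years = np.asarray(years, dtype=float)
    anomalies = np.asarray(anomalies, dtype=float)
    ok = ~np.isnan(anomalies)
    # Centre the time axis so the fit is well conditioned.
    x = years[ok] - years[ok].mean()
    slope_per_year = np.polyfit(x, anomalies[ok], 1)[0]
    return 10.0 * slope_per_year

# Synthetic series (not real data): warming until 1980, cooling afterwards.
years = np.arange(1957, 2007)
synthetic = np.where(years < 1980,
                     0.02 * (years - 1957),
                     0.46 - 0.01 * (years - 1980))

for start, end in [(1957, 2006), (1957, 1979), (1980, 2006)]:
    window = (years >= start) & (years <= end)
    print(start, end, round(trend_per_decade(years[window], synthetic[window]), 3))
```

A single trend fitted over the whole record can be positive even when the most recent decades are cooling, which is why the sub-period trends are reported separately.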

We used a gridding methodology to weight the AWS reconstructions in proportion to the area they represent. Using Steig’s method, 3 stations on the peninsula covering 5% of the continent’s area would have the same weighting as three interior stations spread over 30% of the continent’s area. The gridding method we used is comparable to that utilized in other temperature constructions such as James Hansen’s GISTEMP. The gridcell map used for the weighted 7 PC reconstruction is shown here.

Cells with a single letter contain one or more AWS temperature stations. If more than one AWS falls within a gridcell, the results were averaged and assigned to that cell. Cells with multiple letters had no AWS within them, but had three or more contiguous cells containing AWS stations. Imputed temperature time series were assigned to these cells based on the average of the neighboring cells. Temperature trends were calculated both with and without the imputed cells. The reconstruction trend using 7 PCs and a weighted station average follows.

The trend has decreased to 0.08 deg C/decade. Although it is not readily apparent in this plot, from 1980 to 2006 the temperature profile has a pronounced negative trend.
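The effect of gridding versus a simple station average can be sketched as follows. This is an illustration under stated assumptions, not the gridcell scheme shown in the map above: the regular latitude/longitude cells, the cell sizes and the cos(latitude) area weights are all assumed here, and the step that imputes empty cells from their neighbors is left out.

```python
import numpy as np

def gridded_mean(lats, lons, series, lat_step=5.0, lon_step=30.0):
    """Average stations within each cell, then area-weight the cells.

    series is a (time x station) array. Stations sharing a cell are
    averaged first so that a crowded cell counts only once.
    """
    cells = {}
    for k in range(series.shape[1]):
        key = (int(np.floor(lats[k] / lat_step)),
               int(np.floor(lons[k] / lon_step)))
        cells.setdefault(key, []).append(series[:, k])
    cell_series, weights = [], []
    for (ilat, _), members in cells.items():
        cell_series.append(np.mean(members, axis=0))
        # Assumed weighting: cell area scales with cos(latitude of cell centre).
        centre_lat = (ilat + 0.5) * lat_step
        weights.append(np.cos(np.radians(centre_lat)))
    return np.average(np.array(cell_series), axis=0, weights=weights)

# Toy case (not real data): three warm stations crowded into one cell,
# one cool station alone in another cell at the same latitude.
lats = np.array([-66.0, -66.5, -67.0, -68.0])
lons = np.array([-61.0, -62.0, -64.0, 100.0])
series = np.column_stack([np.ones(4), np.ones(4), np.ones(4), -np.ones(4)])

print("simple average :", series.mean(axis=1))
print("gridded average:", gridded_mean(lats, lons, series))
```

In the toy case the simple average is pulled toward the crowded cell, while the gridded average treats the two cells equally. That is the same mechanism by which a cluster of peninsula stations can dominate an unweighted continental mean.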

Temporal smearing problems caused by too few PCs?

The temperature trends using the various reconstruction methods are shown in the table below. We have broken the trends down into three time periods: 1957 to 2006, 1957 to 1979, and 1980 to 2006. The time frames are not arbitrarily chosen, but mark an important distinction in the AWS reconstructions. There is no AWS data prior to 1980. In the 1957 to 1979 time frame, every single temperature point is a product of the RegEM algorithm. In the 1980 to 2006 time frame, AWS data exists (albeit quite spotty at times) and RegEM leaves the existing data intact while infilling the missing data.

We highlight this distinction as limiting the reconstruction to 3 PCs has an additional pernicious effect beyond spatial smearing of the peninsula warming. In the table below, note the balance between the trends of the 1957 to 1979 era vs. that of the 1980 to 2006 era. In Steig’s 3 PC reconstruction, moderate warming that happened prior to 1980 is more balanced with slight cooling that happened post 1980. In the new 7 PC reconstruction, the early era had dramatic warming, the later era had strong cooling. It is believed that the 7 PC reconstruction more accurately reflects the true trends for the reasons stated earlier in this paper. However, the mechanism for this temporal smearing of trends is not fully understood and is under investigation. It does appear to be clear that limiting the selection to three principal components causes warming that is largely constrained to a pre-1980 time frame to appear more continuous and evenly distributed over the entire temperature record.

Reconstruction	1957 to 2006 trend	1957 to 1979 trend (pre-AWS)	1980 to 2006 trend (AWS era)
Steig 3 PC	+0.14 deg C/decade	+0.17 deg C/decade	-0.06 deg C/decade
New 7 PC	+0.11 deg C/decade	+0.25 deg C/decade	-0.20 deg C/decade
New 7 PC weighted	+0.09 deg C/decade	+0.22 deg C/decade	-0.20 deg C/decade
New 7 PC weighted, with imputed cells	+0.08 deg C/decade	+0.22 deg C/decade	-0.21 deg C/decade

Conclusion

The AWS trends which this incredibly long post was created from were used only as verification of the satellite data. The statistics used for verification are another subject entirely. Where Steig09 falls short in the verification is that RegEM was inappropriately applying area weighting to individual temperature stations. The trends from the AWS reconstruction clearly have blended into distant stations creating an artificially high warming result. The RegEM methodology also appears to have blended warming that occurred decades ago into more recent years to present a misleading picture of continuous warming. It should also be noted that every attempt made to restore detail to the reconstruction or weight station data resulted in reduced warming and increased cooling in recent years. None of these methods resulted in more warming than that shown by Steig.

We don’t yet have the satellite data (Steig has not provided it) so the argument will be:

“Silly Jeffs, you haven’t shown anything; the AWS wasn’t the conclusion, it was the confirmation.”

To that we reply with an interesting distance correlation graph of the satellite reconstruction (also from only 3 PCs). The conclusion has the exact same problem as the confirmation. Stay tuned.

(Graph originally calculated by Steve McIntyre)

135 Comments

Richard111

February 28, 2009 11:45 pm

Any chance of a PDF file pretty please?
REPLY: Here you go, hot off the press, just for you. – Anthony
http://wattsupwiththat.files.wordpress.com/2009/03/steigs-antarctic-heartburn-wuwt-022809.pdf

rephelan

Editor

February 28, 2009 11:58 pm

Jeff & Jeff
I left my graduate program in 1973 because they seemed to have a fondness for what I called 99X99 Sociology – I think they call it “data-mining” today. My statistics were always atrocious…. but I think I follow your argument here. What I can’t quite get, were there any “real” observations left in the analysis or were the results based all on smoothed and filled-in numbers? What happened if a station had a value outside the range of neighboring grids? I gather that the RegEm process is iterative?

Bruce Cunningham

March 1, 2009 12:06 am

I can see another Wegman report , along with a dozen (cough cough) “independent” studies confirming Steig 09 in the future. Tighten your seat belts.

E.M.Smith

Editor

March 1, 2009 12:06 am

So much data to fabricate, so little time. Why can’t these folks (Steig, Hansen) just use the real data?
Jeff & Jeff, thank you. A wonderful exposition.
FWIW, I think a very similar thing happens in GIStemp in that the recursive application of “The Reference Station Method” will cause blurring of one climate zone into another (in particular, the raising of temperatures on the coasts by comparing them with the interior ‘reference stations’ to adjust for UHI. Coasts are heavily populated, so the rural stations will tend to be inland, and inland tends to be more volatile, yet a simple subtraction is done rather than a comparative slope or correlation coefficient). It would be fascinating to see the same distance correlation plots done on the “raw” NOAA data and GIStemp processed temps.

March 1, 2009 12:11 am

Unless Steig provides the requested data, Nature should withdraw this work or even that Issue/Volume. I certainly will not be submitting any work to this journal in the future if this is the way they referee submissions.

Richard111

March 1, 2009 12:17 am

Gee! Thanks. Sixteen pages, just like that!
(My wife won’t thank you 🙂 )

Lindsay H

March 1, 2009 12:25 am

a very nice piece of analysis, very good work

Leon Brozyna

March 1, 2009 12:26 am

A quick comment after reading through the foreword; I’ll read the rest in the morning after I’m fully awake and have digested my scrambled eggs.
I tried wading through the analysis done on CA and, while I think I apprehended what was done there, I’m looking forward to this write-up.
The number of problems that surfaced after being peer-reviewed published in Nature highlights the problem with the so-called peer review process. If all the peers reviewing a paper accept the premise and conclusion already, such as global warming, they’re most likely to just scan quickly over a paper before giving it their blessing, as appears to have happened with Steig et al. This is the sort of problem I recall that was addressed in the Wegman report to Congress regarding the “hockey stick.” How embarrassing it must be to have a peer-reviewed paper published and then to have significant flaws pointed out, flaws which should have been found in a robust peer review process.

Juraj V.

March 1, 2009 12:26 am

What happened with the Harry station? Which data were used for it in this reconstruction?

Phillip Bratby

March 1, 2009 12:29 am

Stunning. Thanks Jeffs.

Cold Englishman

March 1, 2009 12:42 am

Other side of the world, but here we go again, isn’t it time the bbc stopped this biassed reporting, they start with the solution, now they’re going to prove it. I want these individuals in 2010 to be paraded by the bbc to give an explanation of why the arctic is still frozen. The world has truly gone barmy!
http://news.bbc.co.uk/1/hi/sci/tech/7917266.stm

Manfred

March 1, 2009 1:00 am

antarctica cooling since at least 1980 is in good agreement with the record sea ice levels in recent years.
the peninsula appears to become the last resort for the AGW crowd.
when the arctic ice recovers, i expect big travel activity to the peninsula from politicians, press, sponsored cruises for the noisiest scientists and maybe a kayaker.

March 1, 2009 1:04 am

Are we sure both sides of the debate are looking at the same definition? AWS could very well stand for automatic warming station.

Annabelle

March 1, 2009 1:28 am

Thanks so much Jeff and Jeff. I’ve been trying to follow the discussion at CA but lost the plot. This explains a lot.
I’m definitely staying tuned.

vivendi

March 1, 2009 1:57 am

Anthony, thanks for providing this clear, easy to understand explanation of a subject difficult to understand by non-experts. I was able to understand some of the conclusions in Steve’s and Jeff^2’s publications, but since I didn’t understand all the terms and the methods, I couldn’t get a good grip. By just spending 20 minutes in reading this article, I was able to brush up and complement my basic knowledge.

Adam Gallon

March 1, 2009 2:13 am

Let me see if I’ve got this right.
1) Everyone can agree that the Antarctic has warmed between 1957 & 2006, the amount is the question?.
2) That warming is confined to the 1957 – 1979 period, the amount is near enough the same no matter what methodology is applied to calculate it?.
3) 1980 – 2006 shows a cooling, again the amount is the question?.
There are lies, damned lies & statistics?

Ceolfrith

March 1, 2009 2:25 am

Off topic, sort of
I have got into the habit of checking the images at http://igloo.atmos.uiuc.edu/cgi-bin/test/print.sh?fm=08 once a week out of curiosity.
I have found today that I can no longer access any images for 2009. Anyone know why they’ve closed all access even with the disclaimer on the site?

M White

March 1, 2009 2:28 am

“I want these individuals in 2010 to be paraded by the bbc to give an explanation of why the arctic is still frozen”
We may have to wait a bit longer
“Currently, he has it down for 2013 – but with an uncertainty range between 2010 and 2016.”
http://news.bbc.co.uk/1/hi/sci/tech/7917266.stm
From Arctic ice modeller Professor Wieslaw Maslowski

michel

March 1, 2009 2:30 am

Once again we come up against the basic question: why will you not release the satellite data?

Jeff Id

March 1, 2009 2:40 am

Since originally finishing these calculations, I’ve calculated the video from the 3 PC’s of the Antarctic temperature anomaly according to Steig. It’s a bit difficult to interpret but I found it very interesting.
http://noconsensus.wordpress.com/2009/02/28/a-little-bit-of-magic/
—
Juraj V, the new improved Harry was used.

Mac

March 1, 2009 2:47 am

Lessons to be learnt.
1. Less data/more statistics gives us a warming trend.
2. More data/less statistics gives us a cooling trend.
The only thing that can be claimed is that since 1980 Antarctica has been cooling.

D. King

March 1, 2009 2:59 am

Wow! As you know, there are two places where sensor results
can be affected. The input, where the errors propagate through
the collection and processed results, and post collection,
where the processed results are affected by the processing.
It troubles me that your investigations are showing consistent
errors in the direction of warming. There is a third possibility for
errors, and that is the data itself is somehow being corrupted.
The hottest years of the last century and the hockey stick come
to mind for data corruption errors. These also showed a warming.
The troubling thing about the recent satellite sensor failure is that
the failure was not total. Data was still being produced and the
conclusions being drawn were in the direction of warming and loss
of sea ice area. With this new study of Antarctic AWS data processing
anomalies, which also show warming, it’s time to call this duck….
A Duck! Policies are being implemented that will impact millions
of people worldwide, some in very devastating ways. Is there no
international body of honorable scientists that can review results
and present their conclusions before these Draconian, and arguably
cruel policies are implemented?

avfuktning ventilation vind - vindsavfuktare

March 1, 2009 3:06 am

Wouldn’t it be reasonable to make hand adjustments according to which stations are, from a meteorological perspective, supposed to be correlated (e.g. on the same side of a rim, or in the same general circulation flow)? It certainly wouldn’t be any more arbitrary than Steig’s method.

Phil

March 1, 2009 3:32 am

Great article.
It’s great that we have people like the two Jeffs keeping track and debunking these bogus reports. I think a lot of people including myself would be lost in all the statistics that we are presented with without such informative articles. Must admit, when I heard the BBC report of the central Antarctic warming it sounded like BS.
Steig et al must be burning the midnight oil to come up with an answer to this.

captdallas2

March 1, 2009 4:19 am

Nice job. The only thing missing is the margin of error which I think is a hoot. With RegEM PC 3 they have a 95% confidence level that their trend results are accurate to +/- what was it? 55%? So the Jeffs’ much smaller trends still fall within the error range of the paper.
That is what I find ridiculous. Why would Nature even publish, much less put on the cover, this study?