Posted by Jeff Id on February 15, 2009
Guest post by Jeff C
Jeff C has done an interesting and impressive recalculation of the automatic weather station (AWS) reconstruction from Steig 09, currently on the cover of Nature.
Warming of the Antarctic ice-sheet surface since the 1957 International Geophysical Year
Jeff C is an engineer who realized that the data was not weighted according to location in the original paper. He has taken the time to come up with a reasonable regridding method which more appropriately weights individual temperature stations across the Antarctic. It’s amazing that a simple, reasonable gridding of temperature stations can make so much difference to the final result.
Jeff Id’s AWS reconstructions using his implementation of RegEM are reasonably close to the Steig reconstructions. The latest difference plot between his reconstruction and Steig’s is quite impressive. Removing two sites from his reconstruction that were erroneously included in initial attempts (Gough and Marion) gives us this chart:
It is clear Jeff is very close as the plot above has virtually zero slope and the “noise level” is typically within +/- 0.3 deg C except for a few outliers (that’s the Racer Rock anomaly at the far right as we are using the original data). Although not quite fully there, it is clear Jeff has the fundamentals correct as to how Steig used the occupied station and AWS data with RegEM.
I duplicated Jeff’s results using his code and began to experiment with RegEM. As I became more familiar with it, it dawned on me that RegEM had no way of knowing the physical location of the temperature measurements. RegEM does not know or use the latitude and longitude of the stations when infilling, as that information is never provided to it. There is no “distance weighting” as it is typically understood, because RegEM has no idea how close or how far the occupied stations (the predictors) are from each other, or from the AWS sites (the predictands). Steig alludes to this in the paper on page 2:
“Unlike simple distance-weighting or similar calculations, application of RegEM takes into account temporal changes in the spatial covariance pattern, which depend on the relative importance of differing influences on Antarctic temperature at a given time.”
I’m an engineer, not a statistician, so I’m not sure exactly what that means, but it sounds like hand-waving and a subtle admission that there is no distance weighting. He might be saying that RegEM can draw conclusions based on the similarity in the temperature trend patterns from site to site, but that is about it. If I’ve got that wrong, I would welcome an explanation.
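To make the point concrete, here is a minimal sketch of the kind of input a covariance-based infilling method works from. It is not RegEM itself, only the station-to-station covariance matrix that such methods build on. The station names, anomaly values and coordinates are all made up for illustration. The thing to notice is that the coordinates never enter the calculation.

```python
from statistics import mean

def covariance(a, b):
    """Sample covariance of two equal-length anomaly series."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def covariance_matrix(series):
    """Covariance between every pair of station series (one row per station)."""
    return [[covariance(a, b) for b in series] for a in series]

# Hypothetical anomaly series (deg C) for three stations; values are invented.
stations = {
    "peninsula_1": [0.1, 0.4, -0.2, 0.6, 0.3],
    "peninsula_2": [0.2, 0.5, -0.1, 0.5, 0.4],
    "interior_1": [-0.3, 0.1, 0.2, -0.4, 0.0],
}

# Hypothetical (lat, lon) pairs. They are defined here only to show that
# nothing below ever reads them.
coords = {
    "peninsula_1": (-63.4, -57.0),
    "peninsula_2": (-64.8, -64.1),
    "interior_1": (-90.0, 0.0),
}

cov = covariance_matrix(list(stations.values()))
for name, row in zip(stations, cov):
    print(name, [round(v, 4) for v in row])
```

Swapping the coordinates between stations, or deleting them entirely, leaves `cov` unchanged. Any spatial information has to be inferred from how the series move together.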
I plotted out the locations of the 42 occupied stations used in the reconstruction below. Note the clustering of stations on the Antarctic Peninsula. This is important because the peninsula is known to be warming, yet only constitutes a small percentage of the overall land mass (less than 5%). Despite this, 15 of the 42 occupied stations used in the reconstruction are on the peninsula.
Location of 42 occupied stations that form the READER temperature dataset (per Steig 2009 Supplemental Information). Note clustering of locations at northern extremes of the Antarctic Peninsula.
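A quick back-of-the-envelope calculation shows why the clustering matters. The station counts and the area figure come from the text above. The two trend values are hypothetical and chosen only to make the effect visible. A plain weighted average is also not what RegEM does; this only illustrates the gap between counting stations and counting area.

```python
# Station counts and area share are taken from the text above.
PENINSULA_STATIONS = 15
TOTAL_STATIONS = 42
PENINSULA_AREA_FRACTION = 0.05  # "less than 5%", used here as an upper bound

# Hypothetical trends (deg C per decade), chosen only to show the effect.
PENINSULA_TREND = 0.5
ELSEWHERE_TREND = 0.0

def blended_trend(peninsula_weight):
    """Average of the two trends with the given weight on the peninsula."""
    return (peninsula_weight * PENINSULA_TREND
            + (1 - peninsula_weight) * ELSEWHERE_TREND)

station_share = PENINSULA_STATIONS / TOTAL_STATIONS
station_weighted = blended_trend(station_share)
area_weighted = blended_trend(PENINSULA_AREA_FRACTION)

print(f"Peninsula share of stations: {station_share:.1%}")
print(f"Trend weighted by station count: {station_weighted:.3f}")
print(f"Trend weighted by land area:     {area_weighted:.3f}")
```

With these made-up inputs, weighting by station count gives the peninsula roughly seven times the influence that weighting by area would.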
I decided to see what would happen if I applied some distance weighting to the data prior to running it through RegEM.
DISCLAIMER: I am not stating or implying that my reconstruction is the “correct” way to do it. I’m not claiming my results are any more accurate than that done by Steig. The point of this exercise is to show that RegEM does, in fact, care about the sparseness, location and weighting of the occupied station data.
I decided to carve up Antarctica into a series of grid cells. I used a triangular lattice and experimented with various cell diameters and lattice rotations. The goal was to have as many cells as possible containing occupied stations, but also to have as high a percentage of the cells as possible contain at least one occupied station. I ended up with a cell diameter of about 550 miles with the layout below.
Gridcells used for averaging and weighting. Cell diameter is approximately 550 miles. Value in parentheses is number of occupied stations in cell. Note that cell C (northern peninsula extreme) contains 11 occupied stations, far more than other cells. Cells without letters have no occupied stations.
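For readers who want to try something similar, here is one way to assign stations to cells on a triangular lattice. This is a sketch under several assumptions, not the layout used above: the projection (azimuthal equidistant, centred on the South Pole), the reading of "cell diameter" as the spacing between neighbouring cell centres, the lattice origin at the pole and all function names are illustrative choices.

```python
import math

EARTH_RADIUS_MILES = 3959.0
CELL_SPACING_MILES = 550.0  # assumed: distance between neighbouring cell centres

def project(lat_deg, lon_deg):
    """Azimuthal equidistant projection centred on the South Pole (miles)."""
    r = EARTH_RADIUS_MILES * math.radians(90.0 + lat_deg)
    lon = math.radians(lon_deg)
    return r * math.sin(lon), r * math.cos(lon)

def lattice_centres(spacing, extent, rotation_deg=0.0):
    """Centres of a triangular lattice covering at least [-extent, extent]."""
    rot = math.radians(rotation_deg)
    row_height = spacing * math.sqrt(3) / 2
    n = int(extent / row_height) + 2
    centres = {}
    for j in range(-n, n + 1):
        for i in range(-n, n + 1):
            # Odd rows are shifted by half a cell to form the triangular pattern.
            x = spacing * (i + 0.5 * (j % 2))
            y = row_height * j
            centres[(i, j)] = (x * math.cos(rot) - y * math.sin(rot),
                               x * math.sin(rot) + y * math.cos(rot))
    return centres

def assign_cell(lat_deg, lon_deg, centres):
    """Return the (i, j) index of the lattice centre nearest to the station."""
    x, y = project(lat_deg, lon_deg)
    return min(centres, key=lambda k: (centres[k][0] - x) ** 2
                                      + (centres[k][1] - y) ** 2)

centres = lattice_centres(CELL_SPACING_MILES, extent=2500.0)
print(assign_cell(-90.0, 0.0, centres))  # a station at the pole
```

Trying different `rotation_deg` values and spacings, then counting how many cells end up with at least one station, is a simple way to reproduce the kind of experimentation described above.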
I sorted the occupied station data (converted to anomalies by Jeff Id’s code) into groups that corresponded to each gridcell location. If a gridcell had more than one station, I averaged the results into a single series and assigned it to the gridcell. Unfortunately, 14 of the 36 gridcells had no occupied station within them. Most of these gridcells were in the interior of the continent and covered a large percentage of the land mass. Since manufacturing data is all the rage these days, I decided to assign a temperature series to these grid cells based on the average of neighboring grid cells. The goal was to use the available temperature data to spread observed temperature trends across equal areas. For example, 17 stations on the peninsula in three grid cells would have three inputs to RegEM. Likewise, two stations in the interior over three grid cells would have three inputs to RegEM. The plot below shows my methodology.
Shaded cells with single letter contain occupied stations. Cells with two or more letters have no occupied stations but have temperature records derived from average of adjacent cells (cell letters describe cells used for derivation). Cells with derived records must have three adjacent or two non-contiguous adjacent cells with occupied stations or they are left unfilled.
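The averaging and infilling rules can be sketched in a few lines. This is an illustration of the rules as described, not the code actually used. The toy cell layout, station names and values are hypothetical, and the handling of missing months (average whatever is present) is an assumption.

```python
def average_series(series_list):
    """Month-by-month mean of several series; None marks a missing month."""
    out = []
    for values in zip(*series_list):
        present = [v for v in values if v is not None]
        out.append(sum(present) / len(present) if present else None)
    return out

def can_infill(filled_neighbours, adjacency):
    """Three filled neighbours, or two that are not adjacent to each other."""
    if len(filled_neighbours) >= 3:
        return True
    if len(filled_neighbours) == 2:
        a, b = filled_neighbours
        return b not in adjacency[a]
    return False

def build_cell_series(station_series, station_cell, adjacency):
    """Average stations within each cell, then derive series for empty cells."""
    by_cell = {}
    for name, series in station_series.items():
        by_cell.setdefault(station_cell[name], []).append(series)
    occupied = {cell: average_series(s) for cell, s in by_cell.items()}

    derived = {}
    for cell, neighbours in adjacency.items():
        if cell in occupied:
            continue
        # Only cells with real stations are used; derived cells are never chained.
        filled = [n for n in neighbours if n in occupied]
        if can_infill(filled, adjacency):
            derived[cell] = average_series([occupied[n] for n in filled])
    return {**occupied, **derived}

# Hypothetical toy layout: X touches A and C (which do not touch each other),
# Y touches A and B (which do), so X is infilled and Y is left empty.
adjacency = {"A": ["B", "X", "Y"], "B": ["A", "Y"], "C": ["X"],
             "X": ["A", "C"], "Y": ["A", "B"]}
station_series = {"s1": [1.0, None], "s2": [3.0, 2.0],
                  "s3": [0.0, 0.0], "s4": [4.0, 4.0]}
station_cell = {"s1": "A", "s2": "A", "s3": "B", "s4": "C"}

cells = build_cell_series(station_series, station_cell, adjacency)
print(cells)
```

In the toy layout, cell X gets the average of A and C, while cell Y stays unfilled because its only two occupied neighbours are contiguous.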
I ended up with 34 gridcell temperature series. Two of the grid cells I left unfilled as I did not think there was adequate information from the adjacent gridcells to justify infilling. Once complete, I ran the 34 occupied station gridcell series through RegEM along with the 63 AWS series. The same methodology was used as in Jeff Id’s AWS reconstruction except the 42 station series were replaced by the 34 gridcell series.
For comparison, here is Steig’s AWS reconstruction:
Calculated monthly means of 63 AWS reconstructions using aws_recon.txt from Steig website. Trend is +0.138 deg C. per decade using full 1957-2006 reconstruction record. Steig 2009 states continent-wide trend is +0.12 deg C. per decade for satellite reconstruction. AWS reconstruction trend is said to be similar.
And here is my gridcell reconstruction using Jeff Id’s implementation of RegEM:
Calculated monthly means of 63 AWS reconstructions using Jeff Id RegEM implementation and averaged grid cell approach. Trend is +0.069 deg C. per decade using full 1957-2006 reconstruction record.
Although the plots are similar, the gridcell reconstruction trend is about half of that seen in the Steig reconstruction. Note that most warming occurred prior to 1972.
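For anyone checking the numbers, a per-decade trend like the ones quoted in the captions can be obtained with an ordinary least-squares fit to the monthly means. The sketch below assumes a complete monthly series starting in January 1957 with no missing values, and uses a synthetic series rather than either reconstruction.

```python
def trend_per_decade(monthly_values, start_year=1957):
    """Ordinary least-squares slope of a monthly series, in units per decade."""
    n = len(monthly_values)
    # Time in years, taken at the middle of each month.
    t = [start_year + (k + 0.5) / 12 for k in range(n)]
    t_mean = sum(t) / n
    y_mean = sum(monthly_values) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, monthly_values))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return 10 * sxy / sxx

# Synthetic check: 50 years (1957-2006) of a perfectly linear series
# rising by 0.0138 deg C per year.
months = 50 * 12
synthetic = [0.0138 * ((k + 0.5) / 12) for k in range(months)]
print(round(trend_per_decade(synthetic), 3))

# Ratio of the two trends quoted in the captions above.
ratio = 0.069 / 0.138
print(round(ratio, 2))
```

The ratio of the two quoted trends works out to about one half, which is where the "about half" above comes from.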
Again, I’m not trying to say this is the correct reconstruction or that this is any more valid than that done by Steig. In fact, beyond the peninsula and coast, the data is so sparse that I doubt any reconstruction is accurate. This is simply to demonstrate that RegEM doesn’t realize that 40% of the occupied station data came from less than 5% of the land mass when it does its infilling. Because of this, the results can be affected by changing the spatial distribution of the predictor data (i.e. occupied stations).