Steig et al. 'Antarctica Warming Paper' process is finally replicated, and "robustness" is dealt a blow.

Jeff Id emailed me today to ask if I wanted to post this, with the caveat “it’s very technical, but I think you’ll like it”. Indeed I do, because it represents a significant step forward in the puzzle that is the Steig et al. paper published in Nature this year (Nature, Jan 22, 2009) that claims to have reversed the previously accepted idea that Antarctica is cooling. From the “consensus” point of view, it is very important for “the Team” to make Antarctica start warming. But then there’s that pesky problem of all that above-normal ice in Antarctica. Plus, there are other problems, such as buried weather stations, which will tend to read warmer when covered with snow. And the majority of the weather stations (and thus data points) are in the Antarctic peninsula, which weights the results. The Antarctic peninsula could even be classified under a different climate zone, given its separation from the mainland and strong maritime influence.

A central prerequisite point to this is that Steig flatly refused to provide all of the code needed to fully replicate his work in MatLab and RegEM, and has so far refused requests for it. So without the code, replication would be difficult, and without replication, there could be no significant challenge to the validity of the Steig et al paper.

Steig’s claim that there has been “published code” is only partially true, and what has been published by him is only akin to a set of spark plugs and a manual on using a spark plug wrench when given the task of rebuilding an entire V-8 engine.

In a previous Air Vent post, Jeff C points out the percentage of code provided by Steig:

“Here is an excellent flow chart done by JeffC on the methods used in the satellite reconstruction. If you see the little rectangle which says RegEM at the bottom right of the screen, that’s the part of the code which was released, the thousands of lines I and others have written for the rest of the little blocks had to be guessed at, some of it still isn’t figured out yet.”

http://noconsensus.files.wordpress.com/2009/04/steigflowrev4-6-09.jpg?w=598&h=364
RegEM Satellite data flow chart. Courtesy Jeff C

With that, I give you Jeff and Ryan’s post below. – Anthony

Antarctic Coup de Grace

Posted by Jeff Id on May 20, 2009

I was going to hold off on this post because Dr. Weinstein’s post is getting a lot of attention right now (it has been picked up on several blogs and even translated into different languages), but this is too good not to post.

Ryan has done something amazing here, no joking. He’s recalibrated the satellite data used in Steig’s Antarctic paper, correcting offsets and trends, determined a reasonable number of PCs for the reconstruction, and actually calculated a reasonable trend for the Antarctic with proper cooling and warming distributions. He basically fixed Steig et al. by addressing the very concern I had: that AVHRR vs. surface station temperature (SST) trends and AVHRR station vs. SST correlation were not well related in the Steig paper.

Not only that, he dealt a substantial blow to the ‘robustness’ of the Steig/Mann method at the same time.

If you’ve followed this discussion at all, you’ve got to read this post.

RegEM for this post was originally ported to R by Steve McIntyre; certain versions used are truncated PC versions by Steve M, as well as modified code by Ryan.

Ryan O – Guest post on the Air Vent

I’m certain that all of the discussion about the Steig paper will eventually become stale unless we begin drawing some concrete conclusions. Does the Steig reconstruction accurately (or even semi-accurately) reflect the 50-year temperature history of Antarctica?

Probably not – and this time, I would like to present proof.

I: SATELLITE CALIBRATION

As some of you may recall, one of the things I had been working on for a while was attempting to properly calibrate the AVHRR data to the ground data. In doing so, I noted some major problems with NOAA-11 and NOAA-14. I also noted a minor linear decay of NOAA-7, while NOAA-9 just had a simple offset.

But before I was willing to say that there were actually real problems with how Comiso strung the satellites together, I wanted to verify that there was published literature that confirmed the issues I had noted. Some references:

(NOAA-11) i1520-0469-59-3-262.pdf

(Drift) orbit.pdf

(Ground/Satellite Temperature Comparisons) p26_cihlar_rse60.pdf

The references generally confirmed what I had noted by comparing the satellite data to the ground station data: NOAA-7 had a temperature decrease with time, NOAA-9 was fairly linear, and NOAA-11 had a major unexplained offset in 1993.

Fig. 1: AVHRR trend (points common with ground data).

Let us see what this means in terms of differences in trends.

Fig. 2: Difference in trend between AVHRR data and ground data.

The satellite trend (using only common points between the AVHRR data and the ground data) is double that of the ground trend. While zero is still within the 95% confidence intervals, remember that there are 6 different satellites. So even though the confidence intervals overlap zero, the individual offsets may not.

In order to check the individual offsets, I performed running Wilcoxon and t-tests on the difference between the satellites and ground data using a +/-12 month range. Each point is normalized to the 95% confidence interval. If any point exceeds +/- 1.0, then there is a statistically significant difference between the two data sets.
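As a rough illustration of the running test described above, here is a minimal Python sketch. It is not Ryan's R code (which is linked at the end of the post): it covers only the t-test, assumes the input is a monthly series of satellite-minus-ground differences, and hard-codes the two-sided 95% critical value of Student's t for 24 degrees of freedom (about 2.064), so it only handles full 25-month windows. The function name and the synthetic series are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 24 degrees of freedom
# (a 25-month window). Hard-coded assumption for this sketch.
T_CRIT_95_DF24 = 2.064

def running_t_ratio(diff, half_window=12):
    """For each month with a full +/-half_window window, return
    mean(diff) / (t_crit * standard error of the mean).
    |ratio| > 1.0 means the window mean differs from zero at the 95% level.
    No autocorrelation correction is applied (assumes independent residuals)."""
    if half_window != 12:
        raise ValueError("critical value is hard-coded for a 25-month window")
    ratios = []
    for i in range(half_window, len(diff) - half_window):
        window = diff[i - half_window : i + half_window + 1]
        mean = statistics.fmean(window)
        se = statistics.stdev(window) / math.sqrt(len(window))
        ratios.append(mean / (T_CRIT_95_DF24 * se))
    return ratios

# Synthetic example: small alternating noise, then a 0.5 deg C offset
# in the second half of the record (made-up numbers, not AVHRR data).
noise = [0.1 if k % 2 == 0 else -0.1 for k in range(60)]
diff = [n + (0.5 if k >= 30 else 0.0) for k, n in enumerate(noise)]
ratios = running_t_ratio(diff)
```

In this synthetic series the early windows stay well inside +/- 1.0, while the windows covering the offset period fall far outside it, which is the kind of signature the plot below is looking for.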

Fig. 3: Results of running Wilcoxon and t-tests between satellite and ground data.

Note that there are two distinct peaks well beyond the confidence intervals and that both lines spend much more than 5% of the time outside the limits. There is, without a doubt, a statistically significant difference between the satellite data and the ground data.

As a sidebar, the Wilcoxon test is a non-parametric test. It does not require correction for autocorrelation of the residuals when calculating confidence intervals. The fact that it differs from the t-test results indicates that the residuals are not normally distributed and/or the residuals are not free from correlation. This is why it is important to correct for autocorrelation when using tests that rely on assumptions of normality and uncorrelated residuals. Alternatively, you could simply use non-parametric tests, and though they often have less statistical power, I’ve found the Wilcoxon test to be pretty good for most temperature analyses.
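The post does not say which autocorrelation correction was used. One common choice, shown here purely as an assumption, is to shrink the sample size using the lag-1 autocorrelation r1 of the residuals, n_eff = n(1 - r1)/(1 + r1), and to use n_eff when computing the standard error. A minimal Python sketch with hypothetical function names:

```python
def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation coefficient of a series."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [x - mean for x in residuals]
    denom = sum(d * d for d in dev)
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return num / denom

def effective_sample_size(residuals):
    """n_eff = n * (1 - r1) / (1 + r1).
    Negative r1 is clipped to zero so the sample size is only ever reduced
    (a conservative choice made for this sketch)."""
    r1 = max(lag1_autocorrelation(residuals), 0.0)
    return len(residuals) * (1 - r1) / (1 + r1)

# A strongly persistent series: ten warm months followed by ten cold months.
persistent = [1.0] * 10 + [-1.0] * 10
r1 = lag1_autocorrelation(persistent)        # 0.85
n_eff = effective_sample_size(persistent)    # far fewer than 20 independent points
```

With persistence this strong, 20 monthly values carry the information of fewer than 2 independent samples, which is why uncorrected t-tests can look much more significant than they should.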

Here’s what the difference plot looks like with the satellite periods shown:

Fig. 4: Difference plot, satellite periods shown.

The downward trend during NOAA-7 is apparent, as is the strange drop in NOAA-11. NOAA-14 is visibly too high, and NOAA-16 and -17 display some strange upward spikes. Overall, though, NOAA-16 and -17 do not show a statistically significant difference from the ground data, so no correction was applied to them.

After having confirmed that other researchers had noted similar issues, I felt comfortable in performing a calibration of the AVHRR data to the ground data. The calculated offsets and the resulting Wilcoxon and t-test plot are next:

Fig. 5: Calculated offsets.

Fig. 6: Post-calibration Wilcoxon and t-tests.

To make sure that I did not “over-modify” the data, I ran a Steig (3 PC, regpar=3, 42 ground stations) reconstruction. The resulting trend was 0.1079 deg C/decade and the trend maps looked nearly identical to the Steig reconstructions. Therefore, the satellite offsets – while they do produce a greater trend when not corrected – do not seem to have a major impact on the Steig result. This should not be surprising, as most of the temperature rise in Antarctica occurs between 1957 and 1970.

II: PCA

One of the items on which we’ve spent a lot of time doing sensitivity analysis is the PCA of the AVHRR data. Between Jeff Id, Jeff C, and myself, we’ve performed somewhere north of 200 reconstructions using different methods and different numbers of retained PCs. Based on that, I believe that we have a pretty good feel for the ranges of values that the reconstructions produce, and we all feel that the 3 PC, regpar=3 solution does not accurately reproduce Antarctic temperatures. Unfortunately, our opinions count for very little. We must have a solid basis for concluding that Steig’s choices were less than optimal – not just opinions.

How many PCs to retain for an analysis has been the subject of much debate in many fields. I will quickly summarize some of the major stopping rules:

1. Kaiser-Guttman: Include all PCs with eigenvalues greater than the average eigenvalue. In this case, this would require retention of 73 PCs.

2. Scree Analysis: Plot the eigenvalues from largest to smallest and take all PCs where the slope of the line visibly ticks up. This is subjective, and in this case it would require the retention of 25 – 50 PCs.

3. Minimum explained variance: Retain PCs until some preset amount of variance has been explained. This preset amount is arbitrary, and different people have selected anywhere from 80-95%. This would justify including as few as 14 PCs and as many as 100.

4. Broken stick analysis: Retain PCs that exceed the theoretical scree plot of random, uncorrelated noise. This yields precisely 11 PCs.

5. Bootstrapped eigenvalue and eigenvalue/eigenvector: Through iterative random sampling of either the PCA matrix or the original data matrix, retain PCs that are statistically different from PCs containing only noise. I have not yet done this for the AVHRR data, though the bootstrap analysis typically yields about the same number (or a slightly greater number) of significant PCs as broken stick.
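Three of the rules listed above (Kaiser-Guttman, minimum explained variance, and broken stick) are simple enough to sketch in a few lines of Python. This is an illustration only, not the R code used for the post: the function names are hypothetical, the eigenvalues are made up rather than taken from the AVHRR data, and the broken-stick expectation used is the standard one, b_k = (1/p) * sum of 1/i for i = k..p.

```python
def kaiser_guttman(eigenvalues):
    """Count PCs whose eigenvalue exceeds the average eigenvalue."""
    avg = sum(eigenvalues) / len(eigenvalues)
    return sum(1 for ev in eigenvalues if ev > avg)

def minimum_explained_variance(eigenvalues, threshold=0.90):
    """Smallest number of leading PCs whose cumulative variance
    fraction reaches the (arbitrary) threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)

def broken_stick(eigenvalues):
    """Count leading PCs whose variance fraction exceeds the broken-stick
    expectation b_k = (1/p) * sum_{i=k..p} 1/i, stopping at the first failure."""
    p = len(eigenvalues)
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    retained = 0
    for k in range(1, p + 1):
        expected = sum(1.0 / i for i in range(k, p + 1)) / p
        if ordered[k - 1] / total > expected:
            retained += 1
        else:
            break
    return retained

# Made-up eigenvalues for illustration (not from the AVHRR data).
eigenvalues = [50, 25, 10, 5, 4, 3, 2, 1]
n_kaiser = kaiser_guttman(eigenvalues)                     # 2
n_var_80 = minimum_explained_variance(eigenvalues, 0.80)   # 3
n_var_95 = minimum_explained_variance(eigenvalues, 0.95)   # 6
n_stick = broken_stick(eigenvalues)                        # 2
```

Even on this toy spectrum, the answer from the explained-variance rule moves from 3 to 6 depending on the preset threshold, which is the arbitrariness noted in rule 3.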

The first 3 rules are widely criticized for being either subjective or retaining too many PCs. In the Jackson article below, a comparison is made showing that 1, 2, and 3 will select “significant” PCs out of matrices populated entirely with uncorrelated noise. There is no reason to retain noise, and the more PCs you retain, the more difficult and cumbersome the analysis becomes.

The last 2 rules have statistical justification. And, not surprisingly, they are much more effective at distinguishing truly significant PCs from noise. The broken stick analysis typically yields the fewest significant PCs, but is normally very comparable to the more robust bootstrap method.

Note that all of these rules would indicate retaining far more than simply 3 PCs. I have included some references:

pca.pdf

North_et_al_1982_EOF_error_MWR.pdf

I have not yet had time to modify a bootstrapping algorithm I found (it was written for a much older version of R), but when I finish that, I will show the bootstrap results. For now, I will simply present the broken stick analysis results.

Fig. 7: Broken Stick Analysis on AVHRR data.

The broken stick analysis finds 11 significant PCs. PCs 12 and 13 are also very close, and I suspect the bootstrap test will find that they are significant. I chose to retain 13 PCs for the reconstruction to follow.

Without presenting plots for the moment, retaining more than 11 PCs does not end up affecting the results much at all. The trend does drop slightly, but this is due to better resolution on the Peninsula warming. The rest of the continent does not change if additional PCs are added. The only thing that changes is the time it takes to do the reconstruction.

Remember that the purpose of the PCA on the AVHRR data is not to perform factor analysis. The purpose is simply to reduce the size of the data to something that can be computed. The penalty for retaining “too many” – in this case – is simply computational time or the inability of RegEM to converge. The penalty for retaining too few, on the other hand, is a faulty analysis.

I do not see how the choice of 3 PCs can be justified on either practical or theoretical grounds. On the practical side, RegEM works just fine with as many as 25 PCs. On the theoretical side, none of the stopping criteria yield anything close to 3. Not only that, but these are empirical functions. They have no direct physical meaning. Despite claims in Steig et al. to the contrary, they do not relate to physical processes in Antarctica – at least not directly. Therefore, there is no justification for excluding PCs that show significance simply because the other ones “look” like physical processes. This latter bit is a whole other discussion that’s probably post worthy at some point, but I’ll leave it there for now.

III: RegEM

We’ve also spent a great deal of time on RegEM. Steig & Co. used a regpar setting of 3. Was that the “right” setting? They do not present any justification, but that does not necessarily mean the choice is wrong. Fortunately, there is a way to decide.

RegEM works by approximating the actual data with a certain number of principal components and estimating a covariance from which missing data is predicted. Each iteration improves the prediction. In this case (unlike the AVHRR data), selecting too many can be detrimental to the analysis as it can result in over-fitting, spurious correlations between stations and PCs that only represent noise, and retention of the initial infill of zeros. On the other hand, just like the AVHRR data, too few will result in throwing away important information about station and PC covariance.
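To make the "initial infill of zeros, then iterate" idea concrete, here is a deliberately stripped-down Python sketch. It is not RegEM: there is no regularization, no TTLS step, and no explicit covariance estimate. It simply fills missing anomalies with zeros, approximates the filled matrix with its single leading principal component (found by power iteration), overwrites only the missing entries with that approximation, and repeats. The function names and the toy matrix are hypothetical.

```python
import math

def leading_singular_triplet(matrix, iterations=50):
    """Leading singular value and vectors via power iteration.
    Assumes the matrix is not all zeros and that the all-ones start
    vector is not orthogonal to the leading right singular vector."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v

def rank1_infill(data, outer_iterations=200):
    """Iteratively infill None entries using a rank-1 approximation.
    Observed entries are never changed; only missing ones are re-estimated."""
    rows, cols = len(data), len(data[0])
    # Initial infill of zeros for the missing points.
    filled = [[0.0 if x is None else float(x) for x in row] for row in data]
    for _ in range(outer_iterations):
        sigma, u, v = leading_singular_triplet(filled)
        for i in range(rows):
            for j in range(cols):
                if data[i][j] is None:
                    filled[i][j] = sigma * u[i] * v[j]
    return filled

# Toy example: an exactly rank-1 table (rows = time, columns = stations)
# with two missing entries whose true values are 6 and 4.
data = [
    [1, 2, 3],
    [2, 4, None],
    [3, 6, 9],
    [None, 8, 12],
]
filled = rank1_infill(data)
```

Because the toy table really is described by one component, one retained component recovers the missing values. Retaining too few components on data with richer structure would discard covariance information, and stopping after too few iterations would leave the estimates biased toward the initial zeros.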

Figuring out how many PCs to retain (i.e., what regpar setting to use) is a bit trickier because most of the data is missing. Like RegEM itself, this problem needs to be approached iteratively.

The first step was to substitute AVHRR data for station data, calculate the PCs, and perform the broken stick analysis. This yielded 4 or 5 significant PCs. After that, I performed reconstructions with steadily increasing numbers of PCs and performed a broken stick analysis on each one. Once the regpar setting is high enough to begin including insignificant PCs, the broken stick analysis yields the same result every time. The extra PCs show up in the analysis as noise. I first did this using all the AWS and manned stations (minus the open ocean stations).

Fig. 8: Broken stick analysis on manned and AWS stations, regpar = 8.

Fig. 9: Broken stick analysis on manned and AWS stations, regpar=12.

I ran this all the way up to regpar=20 and the broken stick analysis indicates that 9 PCs are required to properly describe the station covariance. Hence the appropriate regpar setting is 9 if all the manned and AWS stations are used. It is certainly not 3, which is what Steig used for the AWS recon.

I also performed this for the 42 manned stations Steig selected for the main reconstruction. That analysis yielded a regpar setting of 6 – again, not 3.

The conclusion, then, is similar to the AVHRR PC analysis. The selection of regpar=3 does not appear to be justifiable. Additional PCs are necessary to properly describe the covariance.

IV: THE RECONSTRUCTION

So what happens if the satellite offsets are properly accounted for, the correct number of PCs are retained, and the right regpar settings are used? I present the following panel:

Fig. 10: (Left side) Reconstruction trends with the post-1982 PCs spliced back in (Steig’s method).

(Right side) Reconstruction trends using just the model frame.

RegEM PTTLS does not return the entire best-fit solution (the model frame, or surface). It only returns what the best-fit solution says the missing points are. It retains the original points. When imputing small amounts of data, this is fine. When imputing large amounts of data, it can be argued that the surface is what is important.

RegEM IPCA returns the surface (along with the spliced solution). This allows you to see the entire solution. In my opinion, in this particular case, the reconstruction should be based on the solution, not a partial solution with data tacked on the end. That is akin to doing a linear regression, throwing away the last half of the regression, adding the data back in, and then doing another linear regression on the result to get the trend. The discontinuity between the model and the data causes errors in the computed trend.
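The regression analogy above can be illustrated with synthetic numbers. In this Python sketch (made-up values, not output from any reconstruction), the "model" and the "data" share exactly the same trend but differ by a constant offset. Splicing the data onto the second half of the model creates a step, and the least-squares trend of the spliced series comes out higher than either input. The function name and series are hypothetical.

```python
def ols_slope(y):
    """Least-squares slope of y against its index (units of y per time step)."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((i - x_mean) * (yi - y_mean) for i, yi in enumerate(y))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx

# Synthetic series: identical trend of 0.01 per step, constant 0.2 offset.
model = [0.01 * t for t in range(100)]
data = [m + 0.2 for m in model]

# Keep the model for the first half, tack the data on for the second half.
spliced = model[:50] + data[50:]

model_trend = ols_slope(model)      # about 0.0100 per step
data_trend = ols_slope(data)        # about 0.0100 per step
spliced_trend = ols_slope(spliced)  # about 0.0130 per step
```

Neither input series has a trend above 0.01 per step, so the extra 0.003 is produced entirely by the discontinuity at the splice point.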

Regardless, the verification statistics are computed vs. the model – not the spliced data – and though Steig did not do this for his paper, we can do it ourselves. (I will do this in a later post.) Besides, the trends between the model and the spliced reconstructions are not that different.

Overall trends are 0.071 deg C/decade for the spliced reconstruction and 0.060 deg C/decade for the model frame. This is comparable to Jeff’s reconstructions using just the ground data, and as you can see, the temperature distribution of the model frame is closer to that of the ground stations. This is another indication that the satellites and the ground stations are not measuring exactly the same thing. It is close, but not exact, and splicing PCs derived solely from satellite data on a reconstruction where the only actual temperatures come from ground data is conceptually suspect.

When I ran the same settings in RegEM PTTLS – which only returns a spliced version – I got 0.077 deg C/decade, which checks nicely with RegEM IPCA.

I also did 11 PC, 15 PC, and 20 PC reconstructions. Trends were 0.081, 0.071, and 0.069 deg C/decade for the spliced and 0.072, 0.059, and 0.055 deg C/decade for the model. The reason for the reduction in trend was simply better resolution (less smearing) of the Peninsula warming.

Additionally, I ran reconstructions using just Steig’s station selection. With 13 PCs, this yielded a spliced trend of 0.080 deg C/decade and a model trend of 0.065 deg C/decade. I then did one after removing the open-ocean stations, which yielded 0.080 and 0.064 deg C/decade.

Note how when the PCs and regpar are properly selected, the inclusion and exclusion of individual stations does not significantly affect the result. The answers are nearly identical whether 98 AWS/manned stations are used, or only 37 manned stations are used. One might be tempted to call this “robust”.

V: THE COUP DE GRACE

Let us assume for a moment that the reconstruction presented above represents the real 50-year temperature history of Antarctica. Whether this is true is immaterial. We will assume it to be true for the moment. If Steig’s method has validity, then, if we substitute the above reconstruction for the raw ground and AVHRR data, his method should return a result that looks similar to the above reconstruction.

Let’s see if that happens.

For the substitution, I took the ground station model frame (which does not have any actual ground data spliced back in) and removed the same exact points that are missing from the real data.

I then took the post-1982 model frame (so the one with the lowest trend) and substituted that for the AVHRR data.

I set the number of PCs equal to 3.

I set regpar equal to 3 in PTTLS.

I let it rip.
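The first substitution step above, removing from the model frame the same points that are missing from the real data, amounts to copying a missing-data mask from one table to another. A minimal Python sketch, assuming both tables are lists of rows with the same shape and that missing values are represented as None (the function name and values are hypothetical):

```python
def apply_missing_mask(model_frame, real_data):
    """Return a copy of model_frame with None wherever real_data is None."""
    return [
        [None if real is None else model for model, real in zip(model_row, real_row)]
        for model_row, real_row in zip(model_frame, real_data)
    ]

# Toy example: two months by two stations.
model_frame = [[0.1, 0.2], [0.3, 0.4]]   # complete model values
real_data = [[1.5, None], [None, -2.0]]  # real record with gaps
masked = apply_missing_mask(model_frame, real_data)
# masked == [[0.1, None], [None, 0.4]]
```

The masked table then has the same pattern of gaps as the real station record, so the method under test sees the same amount of information it had in the original analysis.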

Fig. 11: Steig-style reconstruction using data from the 13 PC, regpar=9 reconstruction.

Look familiar?

Overall trend: 0.102 deg C/decade.

Remember that the input data had a trend of 0.060 deg C/decade, showed cooling on the Ross and Weddell ice shelves, showed cooling near the pole, and showed a maximum trend in the Peninsula.

If “robust” means the same answer pops out of a fancy computer algorithm regardless of what the input data is, then I guess Antarctic warming is, indeed, “robust”.

———————————————

Code for the above post is HERE.

May 22, 2009 5:57 am

Basil (07:10:49) beginning: “I think the Internet is changing the rules of the game here. …”
A very fine piece, Basil. A solid brick in a wall a’building.

hunter
May 22, 2009 6:32 am

Flanagan,
Are you saying that the Mann/Steig & pals work shows the entire Antarctica continent is warming in the last several years?

Wondering Aloud
May 22, 2009 6:58 am

I have a lot more trouble with the crappy peer review Steig received than with the original paper.
However, Flanagan, what remains of the warming of the Steig paper is a tiny fraction of what was claimed, far smaller than the likely margin of error, and that only exists if you cherry pick start and end dates. You missed all of that?
What remains is the Antarctic is not warming though the “models” so many take as better than reality, claim it should be.
Victory isn’t the issue! Theory has to agree with observation or theory is wrong. With the current state of the data on a huge variety of sub-issues any claim that CO2 is driving large scale climate change is clearly not supported.

Layman Lurker
May 22, 2009 7:24 am

Flanagan states: “Err, to come back on the post. It really really looks like any type of temperature reconstruction actually leads to warming. So where’s the problem, again?”
The confidence interval for Steig’s warming is approximately +/-0.07C per decade. In addition, Steig’s paper showed a warming trend throughout the 50 year period. The reconstructions done by Jeff and Ryan show a cooling trend for the last 40 years. IOW, Ryan’s work shows no basis for inferring warming of any kind.
The point of Ryan’s work is not to show that Antarctica is or is not cooling, it is to draw attention to Steig’s faulty methodology and his claim of warming.

May 22, 2009 7:55 am

Flanagan (22:54:27) :
The difference between Steig08 and reality is simple to understand: Steig08 claims a temperature trend of 0.12 +/- 0.07 C/decade and created math that produces that trend very robustly regardless of the input.
Ryan, JeffC and myself have repeatedly shown that the true trend for a least squares fit (not always considered optimal) is half of that. So you make a reasonable sounding point that the trend line is positive. If you are open minded about it though you need to realize that the trend in surface stations for the last 40 years has been downward in our repaired, corrected, redone or even moderately improved reconstructions.
Even that isn’t enough, so as an open minded individual please consider that there were far fewer temperature stations active in the 1957-1967 time period than today. This creates the situation where the math in any of these reconstructions becomes affected quite heavily by the density of the stations. From memory there were fewer than 18 of the 42 stations in operation.
Since the trend is downward for 40 years instead of the continuous up slope on the cover of Nature, gavin (happy little g) would have to change his statement again that the longish term down slope is consistent with the models. Something he definitely won’t do.
Consistency with models and sea level rise is the point of the paper.
I hope you can understand now why this work is so entirely contradictory in result from the published form. You gave me an idea for a post after the weekend, if I’m right the result may be quite interesting.

Ryan O
May 22, 2009 8:44 am

Geoff,
It’s not quite as simple as that. There are two distinct issues with the satellites:
1. Channel degradation with time.
2. Lack of overlap for calibration.
For the first issue, the degradation of each channel is certainly a function of time (channels certainly don’t get better with time). If the AVHRR data were the result of the output of a single channel, then, with respect to channel degradation, your comment would be accurate.
In the case of the AVHRR instrument, however, there are multiple channels that are needed for cloud masking. Channels 1 and 2 are used for albedo measurements to attempt to distinguish between snow, ice, water, land, vegetation, clouds, etc. Channel 3 is also used for albedo measurements at a different wavelength. Channels 4 and 5 are used for the actual temperature measurement.
The process of cloud masking involves using channel differencing to detect the presence of clouds. So while each channel degrades over time, they do not degrade equally. Depending on the relative rates of degradation, this will result in misidentification of clear skies as cloudy or cloudy skies as clear – which means the measured temperature could go either up or down – and could even switch direction with a given satellite.
For the second issue, the satellites are not calibrated to each other. There is no overlap. The visible channels (1 & 2) are calibrated to a spot in the Libyan desert. The other channels, however, are calibrated to an internal blackbody vs. space. So for channels 3-5, the in-flight calibration only acquires two points on the calibration curve. This can very definitely result in offsets at the very beginning of the operation of the satellite, and can result in either upward or downward drifts depending on whether the background temperature of space changes due to changing quantities of space dust causing increased/decreased scatter of solar light. The temperature of the blackbody can also drift over time.

David Ball
May 22, 2009 8:46 am

Flanagan sees and hears (and comprehends) only what Flanagan wants to see and hear and comprehend. This is consistent with every advocate of AGW I have ever encountered, and they call us deniers and flat-earthers.

Ryan O
May 22, 2009 8:50 am

Geoff,
Something I forgot. You can completely detrend the satellite data and it does not greatly affect the reconstruction. Steig even did this in his paper. The reason is that the satellite data is primarily used to determine station covariance, not temperatures.

May 22, 2009 9:40 am

Mr Lynn
I tried the password but access was denied. I’ve managed to get into all your bank accounts though 🙂
Can you repeat the password exactly as it should be used?
Thanks
Tonyb

George E. Smith
May 22, 2009 11:48 am

“”” Jeff Id (20:14:12) :
I believe I’ve used a lot of George E. Smiths product inventions. I used to make a living programming CCD vision systems. “””
Well Jeff you lift me to an undeserved lofty plane with that one.
Let me make it perfectly clear for the record:- I am not now, and never have been the George E. Smith formerly of Bell Telephone Laboratories who is famous for inventing the Charge-Coupled Device (CCD) that has contributed so much to advanced imaging and to science as a result of that invention. Wasn’t me; but often when I would go to a convention, I would find out I was already registered before I got there; but it was always GES from Bell Labs. At the same time the director of R&D at Beckman Instruments was also a George E. Smith; and I am not him either; but I was at the time the VP of R&D for what was a start-up, and at one point the largest (in sales) Light Emitting Diode Company in the world. (now defunct, and merged into Siemens Corp).
But no I am not in the same league as those other George E. Smiths; nor with the Heroic one who was forced to dive overboard and swim away under a burning fuel fire from his ship which was Bombed on the morning of Dec 7th 1941 at Pearl Harbor; (well it was Dec 8th for me).
No you have to turn your computer mouse over; and if it has a ball, it was none of my doing; but if it has an LED or laser optics; then it is either my optics design or a Hong Kong Fooey knockoff of mine. Or if you happen to drive a 1996 Ford Thunderbird or some later models, you could be driving with my LED tail lights.
I did Google my name once and found me about ten pages in somewhere. I used to work at a company that had three of us working for them at the same time.

zoonotica
May 22, 2009 12:24 pm

Good effort.
But is this a critique of the peer review process or the study?
I’m quite familiar with the peer-review process (from the outside as a scientist and the inside working on editorial), and I know that scientists are the harshest critics of scientists.
I certainly commend you on your efforts, but until any of this finds its way into a peer-reviewed journal it’s not worth the paper it’s written on.

Brendan H
May 22, 2009 3:57 pm

“Reply: Different contexts, sorry. Should you call another poster a fraud, then you may have some grounds for complaint.”
Presumably, you are saying that if a poster calls another poster a fraud, there would be cause for complaint, but that it’s OK to call a third party a fraud.
In that case, if this rule is to be applied consistently, it should also be OK to call a third party a ~snip~.
“And remember, WUWT officially allows open season on Hansen and Mann. ~ charles the moderator”
I wasn’t aware of that. What about Gore, then? Or AGW people in general?

jorgekafkazar
May 22, 2009 8:58 pm

Jeff Id (17:13:09) : “jorgekafkazar (16:47:40) : I need to point out that Mann is a coauthor of this paper so we would expect some comparison.”
Well, yes, but Mann shows up as coauthor in so many places, it’s my take that he’s just there to pad the list, make it look like there’s a consensus.
“I agree with you excepting the Monday morning quarterback. Steig is the one paid to do this day in and day out. We are those who do it in our spare time, unfunded and in this case for understanding.”
That is certainly true. And I agree that the Steig trend (though not actually based on cherry-picked data) is less relevant than the last 40 year trend, especially when you take into account the low station count* in the 50’s and the recent growth in Antarctic ice extent.
* A friend of mine was at one of those early stations for a year.
There’s very little wiggle room for AGW in Antarctic climate. Did Dr. Steig try to create some by using “optimized” methodology? I’m not sure, and I’d like to hear what he has to say.
I’d also like to hear what Nature has to say about “peer review,” which apparently now has all the effectiveness and relevance of a Kool-Aid cocktail party.

Geoff Sherrington
May 22, 2009 10:58 pm

Ryan O (08:50:49) :
Thank you for the explanation. I was aware that the answer might involve multichannels. Here, I simply make the point that I was not “hoping” for a rise or fall in the final outcome, I was simply wanting to suggest another way that might make the science less error prone.

Brendan H
May 22, 2009 11:52 pm

“In that case, if this rule is to be applied consistently, it should also be OK to call a third party a ~snip~.”
So I guess that’s a no.
Staying with this issue for a moment, Charles the moderator’s original reply to my query at (2:50:35) was: “Should you call another poster a fraud, then you may have some grounds for complaint.”
Why is it unacceptable to accuse another poster of fraud, but acceptable to accuse a third party of fraud?

Ryan O
May 23, 2009 7:03 am

Geoff,
I initially tried it the way you mentioned. It resulted in forcing the satellite data to “trend” in order to make the mean difference for the satellite zero – in other words, I ended up with a bunch of zig-zags with slopes that were statistically different from zero – so that was obviously not the right answer. The only way to resolve this is to allow for a beginning-of-life offset. Some of the literature about the AVHRR instrument shows that other researchers have noted the same thing.
The second way I tried was to regress the satellite data against the ground data. The offsets naturally appear as a result of that regression. Unfortunately, the regression resulted in some artifacting and did not properly account for the NOAA-11 drop around ~1993.
In the end, I decided the simple offsets were the easiest and safest way to perform the calibration. They avoid over-fitting criticisms and do not result in a significant change in the reconstruction trend. The result of the calibration mainly appears in how much variation the first PC explains, so what it really does is change the satellite covariance to more closely match the ground station covariance by changing the weights of the PCs.

Roger Knights
May 24, 2009 4:22 pm

“If “robust” means the same answer pops out of a fancy computer algorithm regardless of what the input data is, then I guess Antarctic warming is, indeed, “robust”.”
Or “well-defended,” anyway.

BB
May 26, 2009 11:23 am

Jeff: you say “I have not yet had time to modify a bootstrapping algorithm I found (it was written for a much older version of R), but when I finish that, I will show the bootstrap results.” I have a Java library that contains bootstrapping code. If you could use it, feel free to email me and I will get you a copy.
Reply: I will send him your message and email address. ~ charles the moderator
