Another Episode Of Cleaning The Augean Stables

Guest Post by Willis Eschenbach (@weschenbach on eX-Twitter)

Our estimable Charles the Moderator, who gets my eternal thanks for keeping the hits happening here on WUWT, asked me to take a look at a new paper yclept Multivariate Analysis Rejects the Theory of Human-caused Atmospheric Carbon Dioxide Increase: The Sea Surface Temperature Rules by Dai Ato, an independent researcher in Japan. Seems it’s been getting some play. I’ll refer to this paper as Ato2024.

I wasn’t far in before alarms went off. The study conducted a multivariate analysis using publicly available data to examine the impact of sea surface temperature (SST) and human emissions on atmospheric CO₂ levels.

It concluded that SST was the independent determinant of the annual increase in atmospheric CO₂ concentration. Human emissions were found to be irrelevant in the regression models.

And most revealingly, it says:

Furthermore, the atmospheric CO₂ concentration predicted, using the regression equation obtained for the SST derived from UK-HADLEY centre after 1960, showed an extremely high correlation with the actual CO₂ concentration (Pearson correlation coefficient r = 0.9995, P < 3e-92).

BZZZZT!! Whenever I get an r-value that high, I know for a fact that I’m doing something very wrong … I’ll get back to that.

First, let me start with one of the three variables in their analysis, which are SST, CO2, and emissions. Here are three reconstructions of SST since 1854 by three different groups.

Figure 1. Global monthly average sea surface temperatures (SSTs). Yellow area at the right is the portion of the record analyzed by Ato.

While there are some differences, overall the pattern is clear. There was SST warming from ~ 1850 to about 1870, cooling to ~ 1910, warming to ~ 1940, cooling to ~ 1965, and warming since then.

Looking at that, I can see why Ato doesn’t want to use the full record—it doesn’t support his claim that SSTs are the independent determinant of atmospheric CO2 levels. The CO2 data (Figs. 2 and 3 below) looks nothing like that.

So how does he justify the cutoff? Well, the Mauna Loa CO2 measurement data starts about 1960. However, it can be extended back beyond that using the ice core CO2 records. Here’s what that looks like.

Figure 2. Mauna Loa and ice core measurements of the background atmospheric CO2 levels, 1000-2010AD. Data: Ice Cores, Mauna Loa

Ato2024 says that the ice core records are not accurate. However, this is belied by the close agreement of the ice core records with each other and with the Mauna Loa measurements as shown above.

Below is a closer view of the recent end of the data since 1850, corresponding to the time frame of the sea surface temperatures (SSTs) in Figure 1.

Figure 3. As in Figure 2, but post-1850 data only

As a result of the good agreement of the ice cores both with each other and with the Mauna Loa data, I see no problem in taking that as a good reconstruction of the post-1850 CO2 levels.

The problem, of course, is that the pre-1960 ocean temperatures do not look anything like the pre-1960 CO2 levels … and this disagreement totally falsifies Ato2024. So he is obliged to ignore it.

Next, how did he get such a great correlation, 0.9995, between SST and CO2 in the post-1960 data? In part the answer lies in what he looked at. Here’s the Mauna Loa post-1960 CO2 record he used. Note that he didn’t use the monthly data, just the annual data. Makes it easier to get a higher Pearson correlation coefficient “r”.

Figure 4. Mauna Loa Observatory CO2 observations, along with the linear trend line.

The recent increase in CO2 is a very slowly accelerating curve which is nearly a straight line. This leads to many false correlations because such a curve is easy to replicate as we’ll see below. This is a recurring problem in climate science.

But that’s just the first problem. The main problem is the procedure that he used. Here’s the description from the paper.

Note that the symbol delta (∆) in the equations means “change in”. So ∆CO2 is the change in CO2 from one year to the next.

Translated, that says:

  • Calculate the best-fit linear estimation of the annual changes in CO2 (∆CO2), based on the Hadley HadSST sea surface temperature.
  • The predicted atmospheric CO2 is then the starting atmospheric CO2 plus the cumulative sum of the estimated annual changes in CO2.
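As a rough sketch of that two-step procedure (not the code used in Ato2024 or in this post), the Python below fits a predictor to the annual changes by ordinary least squares and then cumulatively sums the fitted changes. The numbers are made-up placeholders rather than HadSST or Mauna Loa values, the helper names ols_fit and reconstruct are hypothetical, and the year-by-year alignment of SST with ∆CO2 is an assumption.

```python
from itertools import accumulate

def ols_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def reconstruct(start_level, predictor, delta):
    """Step 1: fit delta ~ predictor. Step 2: cumulatively sum the fitted deltas."""
    slope, intercept = ols_fit(predictor, delta)
    fitted = [slope * p + intercept for p in predictor]
    return [start_level + s for s in accumulate(fitted)]

# Made-up illustrative numbers, NOT real SST or Mauna Loa data.
sst = [0.00, 0.05, 0.02, 0.10, 0.12, 0.20, 0.18, 0.30]  # hypothetical anomalies, deg C
co2 = [317.0, 317.8, 318.5, 319.6, 320.9, 322.4, 323.7, 325.6, 327.3]  # hypothetical ppmv

# Year-to-year change in CO2 (one fewer value than the level series).
delta_co2 = [later - earlier for earlier, later in zip(co2, co2[1:])]

# Estimated CO2 levels for every year after the starting year.
predicted = reconstruct(co2[0], sst, delta_co2)
for actual, estimate in zip(co2[1:], predicted):
    print(f"actual {actual:7.2f}   estimated {estimate:7.2f}")
```

Here predicted[k] is the estimate that lines up with co2[k + 1], since the starting level co2[0] is taken as given.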

Here’s a graph of the first part of that calculation, fitting the SST to the annual change in CO2.

Figure 5. Post 1960 annual change in atmospheric CO2 (∆CO2), along with the linear trend line of ∆CO2, and the best estimation of ∆CO2 based on the Hadley HadSST4.0.1.

Now, there’s an oddity about graphing delta CO2, or ∆ anything for that matter. It involves a couple of curious changes. I’ll use graphing ∆CO2 as in Fig. 5 as my example.

First, any overall linear trend in the CO2 data is converted into an overall offset from zero (a non-zero average) in the ∆CO2 graph.

Second, any overall acceleration in the CO2 data is converted into an overall linear trend in the ∆CO2 graph.

So from the non-zero average and the upward trend of the ∆CO2 data in Figure 5, we can see that the underlying CO2 data has both a positive trend and an acceleration. We can see both of those in Figure 3 above.
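A short worked example makes those two conversions concrete. It assumes, purely for illustration, that the CO2 level in year t follows a quadratic with hypothetical coefficients a (starting level), b (linear trend) and c (acceleration); none of these symbols appear in Ato2024.

```latex
C_t = a + b\,t + c\,t^2
\qquad\Longrightarrow\qquad
\Delta C_t = C_t - C_{t-1}
           = b + c\,(2t - 1)
           = (b - c) + 2c\,t
```

The trend b survives only as part of the constant offset (b − c), and the acceleration c shows up as the slope 2c of the ∆CO2 series.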

And now that we’ve fitted the SST to the ∆CO2 data so we can estimate the ∆CO2, we simply sum those changes cumulatively to estimate the underlying CO2 data. Here’s that result.

Figure 6. Mauna Loa CO2 data, and Ato2024 estimation of the Mauna Loa CO2 data

At this point, I’ve replicated his results.

Now, remember that I said that a correlation coefficient of 0.999+ means there’s some fatal flaw in the logic. So … what’s not to like?

In his note asking me to take a look at this paper, Charles The Moderator included an interesting AI analysis of the paper, viz (emphasis mine):

Based on my analysis of the paper, the key issue of circular reasoning appears to be in the methodology used to predict atmospheric CO2 concentrations from sea surface temperature (SST) data. Specifically:

The author uses multiple linear regression to derive an equation relating annual CO2 increase to SST for the period 1960-2022.

This equation is then used to “predict” CO2 concentrations for the same 1960-2022 time period.

The predicted and measured CO2 concentrations are found to have an extremely high correlation (r = 0.9995).

The circular reasoning occurs because the same data is used both to derive the equation and to test its predictive power. The key equations involved are:

The regression equation (from Step 7 in the paper):

Annual CO2 increase = 2.006 × HAD-SST + 1.143 (after 1959)

The prediction equation:

[CO2]n = Σ[ΔCO2]i + Cst

Where [CO2]n is the predicted CO2 concentration, [ΔCO2]i is the annual increase calculated from the regression equation, and Cst is the actual CO2 concentration in the starting year.

By using this method, the author is essentially fitting the equation to the data and then using that same fitted equation to “predict” the data it was derived from. This guarantees an extremely high correlation that does not actually demonstrate any predictive power or causal relationship.

A proper analysis would use separate training and testing datasets, or employ techniques like cross-validation, to avoid this circularity.

The extremely high correlation reported is almost certainly an artifact of this flawed methodology rather than evidence of a genuine relationship between SST and atmospheric CO2 levels.

And the AI is right. Well, partly right. They’re right to say that the problem is not that Ato2024 fitted SST to CO2. The problem is that Ato2024 didn’t withhold half the data to verify the results. It’s easy to predict something when you already know the outcome …

HOWEVER, and it’s a big however … while that problem alone is enough to totally falsify the conclusions, there’s another really big problem. To illustrate that, I’ve used the Ato2024 method. But instead of using sea surface temperature as the input to be fitted to the ∆CO2 data as Ato2024 does, I’ve fitted a straight line to the ∆CO2 data. It’s the blue line in Figure 5 above.

And using the Ato2024 method, I’ve converted that straight line to the equivalent CO2 data shown in red in Figure 7 below.

Figure 7. As in Figure 6, plus a red line showing the result of using a simple straight line in place of the sea surface temperature (SST) used in Ato2024.

Interesting. Using the Ato2024 method of fitting a variable to ∆CO2, a straight line as input does just as well as using the SST as input.
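The same comparison can be tried on synthetic data. The sketch below is an illustration under stated assumptions, not a replication: the ∆CO2 series is an invented offset plus trend plus wiggle, the "wiggly" predictor is an arbitrary stand-in for an SST-like series, and all function names are hypothetical. Both predictors are pushed through the fit-then-cumulative-sum method and scored with Pearson r.

```python
import math
from itertools import accumulate

def ols_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

def reconstruct(start_level, predictor, delta):
    """Fit delta ~ predictor, then cumulatively sum the fitted deltas."""
    slope, intercept = ols_fit(predictor, delta)
    fitted = [slope * p + intercept for p in predictor]
    return [start_level + s for s in accumulate(fitted)]

years = list(range(1, 61))
start = 315.0  # arbitrary starting level, ppmv

# Synthetic annual changes: offset + slow trend + year-to-year wiggle (invented).
delta = [0.8 + 0.03 * t + 0.4 * math.sin(1.3 * t) for t in years]
levels = [start + s for s in accumulate(delta)]

straight_line = [float(t) for t in years]                    # nothing but time
wiggly = [0.01 * t + 0.1 * math.cos(0.7 * t) for t in years]  # arbitrary trending series

r_line = pearson_r(reconstruct(start, straight_line, delta), levels)
r_wiggly = pearson_r(reconstruct(start, wiggly, delta), levels)
print(f"straight line as input: r = {r_line:.5f}")
print(f"wiggly series as input: r = {r_wiggly:.5f}")
```

With these invented inputs both values of r come out very close to 1, even though neither predictor has any physical connection to the synthetic CO2 series.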

But that doesn’t really show the full scope of the problem. To do that, I first divided the SST, the straight line, and ∆CO2 data in two halves. I used the first half for fitting either the SST or the straight line to the first half of the ∆CO2. Then I used those results to estimate the change in CO2. Figure 8 shows that result.

Figure 8. As in Figure 7, but using only the first half of the data to fit the model, and then using the full data to see how well it performs.

This graph reveals two separate problems. First, although the fit is considerably poorer than in Figure 6, the Pearson correlation coefficient “r” is basically unchanged … meaning that it is not an appropriate measure for this particular issue.
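To see why r is a poor yardstick here, the sketch below fits a straight line to the first half of a synthetic ∆CO2 series, reconstructs the full record, and reports both Pearson r and the root-mean-square error (RMSE). The series is invented (a smooth, gently curving growth with no noise, chosen for simplicity), and the function names are hypothetical.

```python
import math
from itertools import accumulate

def ols_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

def rmse(a, b):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a))

years = list(range(1, 61))
start = 315.0  # arbitrary starting level, ppmv

# Synthetic annual changes whose growth itself speeds up over time (invented).
delta = [0.8 + 0.0006 * t * t for t in years]
levels = [start + s for s in accumulate(delta)]

def predict_levels(fit_years, fit_delta):
    """Fit a straight line on the given years, then reconstruct ALL years."""
    slope, intercept = ols_fit(fit_years, fit_delta)
    fitted = [slope * t + intercept for t in years]
    return [start + s for s in accumulate(fitted)]

half = len(years) // 2
full_fit = predict_levels(years, delta)                # fitted on everything
half_fit = predict_levels(years[:half], delta[:half])  # fitted on first half only

r_full, r_half = pearson_r(full_fit, levels), pearson_r(half_fit, levels)
rmse_full, rmse_half = rmse(full_fit, levels), rmse(half_fit, levels)
print(f"fit on all data:   r = {r_full:.4f}, RMSE = {rmse_full:.2f} ppmv")
print(f"fit on first half: r = {r_half:.4f}, RMSE = {rmse_half:.2f} ppmv")
```

In this toy case r stays close to 1 for both fits, while the RMSE of the half-data fit is clearly larger. Pearson r ignores offsets and scale, so two rising curves correlate almost perfectly even when one drifts well away from the other.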

Next, the straight line continues to perform just as well as using the SST as the independent variable … no bueno. This indicates a profound problem with the underlying Ato method.

To show the problem, I’m gonna re-show Figure 5 from above.

To recap, first, any overall linear trend in the CO2 data is converted into an overall offset from zero (a non-zero average) in the ∆CO2 graph.

Second, any overall acceleration in the CO2 data is converted into an overall linear trend in the ∆CO2 graph.

And here’s the key. When you fit the SST data (or more importantly, any data) to the ∆CO2 data, you end up with a fitted signal that has the same non-zero average and the same trend as the ∆CO2 data.

Not only that, but the fit will be balanced, with the amount above and the amount below the trend line being equal.

And all of that guarantees that if you start out trying to predict a smooth curve, when you reconstruct the signal using the method of Ato2024, you’ll get an answer that is VERY close to the smooth curve regardless of what variable you use to reconstruct the signal.

And that is why using the straight line does just as well as using the SST, or any other variable, as the basis for the estimation of CO2.
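One piece of this can be made precise with a short derivation. The notation is introduced here for illustration and is not taken from Ato2024: y_i is the observed ∆CO2 in year i, x_i is any predictor, ŷ_i is the least-squares fit with an intercept, e_i is the residual, C_0 is the starting CO2 level, and N is the number of years.

```latex
\hat{y}_i = \hat{a} + \hat{b}\,x_i, \qquad
e_i = y_i - \hat{y}_i, \qquad
\sum_{i=1}^{N} e_i = 0 \quad \text{(normal equation for the intercept)}

C_n = C_0 + \sum_{i=1}^{n} y_i, \qquad
\hat{C}_n = C_0 + \sum_{i=1}^{n} \hat{y}_i
\quad\Longrightarrow\quad
C_n - \hat{C}_n = \sum_{i=1}^{n} e_i

C_N - \hat{C}_N = \sum_{i=1}^{N} e_i = 0
```

So whatever predictor is used, the reconstructed curve starts at the observed starting level and is forced to land exactly on the observed final level. In between, the error is only a running sum of residuals that must return to zero.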

I weep for the death of honest peer-review …

My best to everyone,

w.

Yeah, you’ve heard it before: When you comment please quote the exact words you’re discussing. It avoids endless misunderstandings.

September 13, 2024 9:23 am

Dear Willis,

Excellent debunking of bad (non)”skeptical” science…

To start, I have been discussing the late Dr. Jaworowski’s stance against ice core CO2 analysis and all I can hope is that he rests in peace together with his wrong ideas about ice core performance. Here reflected at:
http://www.ferdinand-engelbeen.be/klimaat/jaworowski.html

Then about the recent works of Ato and others, comparing temperature variability with CO2 variability.
First, the ratio between CO2 and T over the past 800,000 years was about 8 ppmv/°C. Here reflected in the Vostok ice core over 420,000 years:
comment image
Data not compensated for the (long) lags of CO2 changes after T changes.
As the temperature proxy is mostly where the snow is formed, near Antarctica, that may be around 16 ppmv/°C for global temperatures.

For the current ocean temperature influence on CO2 in the atmosphere, we have the formula of Takahashi, based on hundreds of thousands of sea surface samples:
(pCO2)seawater Tnew = (pCO2)seawater Told x EXP[0.0423 x (Tnew – Told)]
http://www.sciencedirect.com/science/article/pii/S0967064502000036
Or some 4.3%/°C in/decrease, independent of the composition of the sample or the start temperature.

That means the influence of increasing SST since the LIA is no more than some 9 ppmv in the atmosphere (with a maximum 0.8°C increase)… Peanuts compared to the over 100 ppmv increase after 1958 and the over 200 ppmv of human emissions over the same period.
Here plotted:
comment image

Reply to  Ferdinand Engelbeen
September 13, 2024 9:34 am

Second part…

The formula of Takahashi should read:

(pCO2)seawater AT Tnew = (pCO2)seawater AT Told x EXP[0.0423 x (Tnew – Told)]

That being said…

The main error of Ato (and others) is that they use temperature (anomaly) against CO2 rate of change, not T rate of change against CO2 rate of change or T against [CO2]…
 
If you plot the derivatives against each other, then things become very clear:
comment image
Temperature rate of change (derivative of 12 month running average) shows all the variability and no trend at all (!), while human emissions is all trend and hardly any variability, twice the trend of the increase in the atmosphere.
Thus temperature is responsible for all the variability (with a lag of about 6 months) in the CO2 rate of change, but not responsible for the trend of the rate of change at all, thus not responsible for the bulk of the increase in the atmosphere…
 
The difference between T rate of change or T changes against CO2 rate of change is interesting too:
comment image
While dT/dt and T variability are very similar, dT/dt has no trend, but T has and is shifted about pi/2 more to the right, fully synchronizing with the dCO2/dt variability…
Any sinusoid shows such behavior, but when synchronized, the lead/lag between T and dCO2/dt is gone…

Henry Pool
Reply to  Ferdinand Engelbeen
September 13, 2024 10:16 am

This is the usual trash from F, for some reason, wanting to prove that he is right. But there is no one elsewhere who actually supports him.

Reply to  Henry Pool
September 13, 2024 1:01 pm

I have had my problems with some of his comments too over 15 years’ time, but he is NEVER impolite or rude, he is the paragon of being a fine gentleman who is very polite in his replies.

Try not to be so wound up about it since it is clear NO ONE here has the full answer on this topic.

Reply to  Sunsettommy
September 13, 2024 1:25 pm

Thanks Tommy, after 15 years of discussions on several fora, it is nice to see the old guys again…

Henry Pool
Reply to  Ferdinand Engelbeen
September 13, 2024 2:43 pm

Tommy
Don’t make a mistake. He is the one responsible for spreading the CO2 nonsense to the IPCC.
In fact, he is going there now, to Athens, to cook his nonsense story and present it on the IPCC dinner table.

Reply to  Henry Pool
September 14, 2024 12:55 am

Henry, please calm down…

My lecture in Athens will be the point of view of the CO2 Coalition, not that of the IPCC. That view is as in the above explanation: that human emissions are the cause of the increase in the atmosphere, but that doesn’t imply that it has “catastrophic” consequences, indeed far more benefits than negative results. Contrary to what the IPCC says.

One part of my lecture will be on the Bern model that the IPCC uses and that “predicts” very long residence times for the extra CO2 in the atmosphere of hundreds to thousands of years, which is completely at odds with reality…

Many of the participants are skeptics who think the same way as Ato here in discussion and maybe one in the (small group of) public that is an IPCC believer…

Henry Pool
Reply to  Ferdinand Engelbeen
September 14, 2024 5:40 am

I am now truly very sorry. I jumped the gun since I knew you were involved previously with the IPCC. I am glad if this is for the CO2 Coalition which I also support. Please accept my sincere apologies. Obviously we still have differing opinions on the chemistry of the oceans.

Reply to  Henry Pool
September 14, 2024 11:11 am

Apologies accepted, although not necessary, I have known you for some (long) time now…
But as far as I know, I was never involved with the IPCC (that would be a miracle), which doesn’t mean that on some points the IPCC can’t be right, and that humans are the cause of the increase is one of them where they are right…