Modeling HadCRUT5 with CO2 and without CO2

By Andy May

I hate statistics, as many of you know. Some people think statistics and/or statistical models that meet standard statistical criteria are facts. The IPCC can be like that. They statistically model global surface temperatures with models of volcanic and anthropogenic forcing and compare the model to one with only volcanic forcing. Then they turn to us, with a straight face, and say the comparison shows anthropogenic forcing is driving all the warming. What about solar? Oh, they considered that, they say; the Sun makes no difference. See their chart in figure 1 from AR6.^[1] Solar is assumed to be zero and volcanism is small, so the model assumes all recent warming is due to humans and then draws the same conclusion, a perfect example of circular reasoning. But what if the solar forcing is not zero? What difference does that make?

Figure 1. The IPCC AR6 assumed forces affecting global surface warming translated to degrees C. From AR6 WG1, page 961.

Numerous papers have been published that show the Sun could have more impact on global temperatures and climate change than assumed by the IPCC.^[2] We must remember that statistical models are not evidence or theories; they aren’t even proper hypotheses. They are just a tool to test the validity of ideas. A hypothesis might come out of a statistical model, but proof never will. If a model repeatedly predicts the future accurately, then it is evidence the hypothesis is correct, but it isn’t proof. The IPCC presents their statistical climate model with the plots shown in figure 2.

Figure 2 is quite busy, but what it says, in brief, is that they assume that natural warming (heavy green line) is zero, which makes, under their assumptions, all warming due to human activities. The WG1 AR6 report is 2,391 pages long, but figure 2, modified slightly from what they display on page 441, really encapsulates everything it proposes. The rest is filler.

Figure 2. The IPCC model shows their greenhouse gas warming hypothesis with this graph. This is after IPCC AR6 WG1 figure 3.9b (page 441). The vertical axis is the temperature anomaly relative to 1850-1900.

There are numerous problems with figure 2, but we will focus on the comparison between the anthropogenic + natural models, in orange, and the observations in black. First of all, the orange is not one model, but the average of many selected models. The range of model calculations (5th to 95th percentile) is shown with light orange shading. The range is quite large. If they had confidence in their models, wouldn’t they choose the best one and use it? If they don’t trust the models, why try to use them as evidence that the Sun has no influence, and all the warming is due to human activities? Why use the models to confidently predict a man-made climate catastrophe? AR6 WGII Summary for Policymakers (p 12-20) reports high confidence in many future catastrophes based on model results. Why high confidence, if the models are so imprecise that they must be averaged? Second, they use thick lines to try and obscure the differences between the black and orange lines, but the differences are significant, especially between 1935 and 1976 and between 1980 and 2000. The model average between 1920 and 1960 looks almost hand-drawn because it is so straight relative to rising temperatures until 1944 and falling temperatures afterward.

So, let’s take a different approach. The classical paleoclimate literature, pre-IPCC, mostly thought that solar variability dominated climate change.^[3] Over time the study of the cosmogenic isotopes ¹⁴C^[4] in tree rings and ¹⁰Be^[5] in ice cores has led to accepted proxy records of the Sun’s output that go back thousands of years (see the discussion of Carbon-14 and Beryllium-10 here).^[6] These isotopes are created in the atmosphere when galactic cosmic rays make it through the solar magnetic field and impact the atmosphere. When solar output is high, its magnetic field is stronger than when it is low. Thus, low concentrations of ¹⁴C^[7] and ¹⁰Be^[8] suggest a strong solar output and vice versa. Since 1700 sunspot records provide a more accurate view of solar activity.^[9]

Studies of ¹⁴C, ¹⁰Be, and sunspot records have uncovered five major long-term solar cycles. These are the Hallstatt (or Bray) cycle of about 2,400 years,^[10] the Eddy Cycle of about 1,000 years,^[11] the de Vries (or Suess) cycle of about 210 years, the Feynman (or Gleissberg) cycle of about 105 years,^[12] and the Pentadecadal cycle of about 50 years.^[13] All the cycle periods are approximate, and they may vary over geological time.^[14] Some may not like my use of the term “cycles,” since our understanding of the cycle periods and the strength or power of each cycle is poor. Perhaps the term oscillation would be better, but understand that I fully appreciate how poorly we understand these cycles and use the term only for convenience and not necessarily according to the precise definition of the word.

The Sun is a dynamo and generates a magnetic field that controls the variations in its output over time. Such a dynamo will have cycles, we have shown they exist and affect Earth’s climate, but the details are sketchy. What astrophysicists and paleoclimatologists have done is observe the Sun and solar impacts on Earth’s climate and recognized in-phase patterns of both solar activity and climate impacts. We discuss these observed (but only approximate) patterns in the post and correlate them to HadCRUT5. Cycles are also observed in other stars that are like our Sun.^[15]

There are also shorter periods of solar variability, like the sunspot cycle which has a varying period and asymmetrical shape that averages about 11 years.^[16] Finally, we have the ENSO cycle, also with a varying period, that is driven, in part, by solar activity.^[17] To cover the shorter solar cycles we include the SILSO sunspot record^[18] and the ERSST Niño 3.4 (ENSO) record from KNMI.^[19]

If we ignore the IPCC assumption that solar activity has played no role in climate change since 1750, as suggested in figure 1, it is possible to investigate the correlation of these well-established cycles or oscillations and one of the global surface temperature records used in AR6, the HadCRUT5^[20] record. Unfortunately, the HadCRUT5 global surface temperature record only goes back to 1850, but it is an instrumental record, and preferable to proxies. The data used to build HadCRUT5 is poor prior to 1958,^[21] so we will also investigate the even shorter period of more accurate data from 1958 to 2023.

We used statistical multiple regression to see how well these cycles and data can predict HadCRUT5. We understand going in, that even if we can build a multiple regression model with a high R² (Coefficient of determination or the square of the correlation coefficient), we haven’t proven anything. We also understand that while global average surface temperature is an important metric of climate change, it is not the only important metric. Other metrics, such as mid-latitude wind speed and direction, as well as surface temperature trends at the poles, and in the tropics (especially in the middle troposphere^[22]) are also important. The purpose of this post is simply to show that the IPCC’s choice to characterize the correlation of the trends in the logarithm of CO₂ concentration and global average surface temperature as “proof” or “evidence” that CO₂ and other human greenhouse gas emissions drive climate change is not very solid. In fact, it is probably wrong. Other reasonable correlations are possible, and arguably better.
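For readers who want to see what a multiple regression and its R² amount to, here is a minimal sketch in plain Python. It is not the R code from the supplementary material; the function names (solve_linear_system, fit_ols) and the toy numbers are hypothetical and chosen only for illustration. It solves the ordinary least squares normal equations and reports R² as one minus the ratio of the residual sum of squares to the total sum of squares.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y on the predictor columns; returns (coefs, r_squared).

    predictors is a list of columns; coefs[0] is the intercept.
    """
    n = len(y)
    columns = [[1.0] * n] + [list(col) for col in predictors]
    k = len(columns)
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(columns[i][t] * columns[j][t] for t in range(n))
            for j in range(k)] for i in range(k)]
    xty = [sum(columns[i][t] * y[t] for t in range(n)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(coefs[i] * columns[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return coefs, 1.0 - ss_res / ss_tot


# Hypothetical toy data built as y = 1 + 2*x1 - 3*x2 (not climate data).
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [-2.0, 3.0, -1.0, 4.0, 0.0]
coefs, r_squared = fit_ols([x1, x2], y)
```

Because the toy data are an exact linear combination of the predictors, the fit recovers the coefficients and R² is essentially 1. Real, autocorrelated series will not behave this cleanly, which is the caution raised above.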

Figure 3 is a plot of the independent or predictor variables used in our regression study. They have been normalized to scales of -3 to +3 by dividing the larger variables (Log(CO₂) and sunspots) by their mean to better compare the variables to one another. In addition, we divided the sunspot number by its standard deviation to help make it comparable in scale to other variables.
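The two kinds of scaling mentioned above can be sketched as follows. This is only an illustration: the helper names (scale_by_mean, scale_by_std) and the input numbers are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was used. The exact steps applied to each series are in the supplementary material.

```python
import statistics


def scale_by_mean(series):
    """Divide every value by the series mean."""
    mean_value = statistics.fmean(series)
    return [v / mean_value for v in series]


def scale_by_std(series):
    """Divide every value by the sample standard deviation (an assumption)."""
    std_value = statistics.stdev(series)
    return [v / std_value for v in series]


# Hypothetical values, not actual sunspot or CO2 data.
toy_series = [2.0, 4.0, 6.0]
by_mean = scale_by_mean(toy_series)
by_std = scale_by_std(toy_series)
```

Either transform only changes the plotting scale of a predictor; in a regression it rescales that predictor's coefficient without changing the fitted values.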

Figure 3. The input series used in this multiple regression study. The y axis scale is an index, and the curves cannot be compared quantitatively.

Unfortunately, our period is too short to properly evaluate some of the stronger climate cycles, like the Hallstatt (light blue) and Eddy (orange) cycles. These two cycles bottomed in the Little Ice Age and their periods are so long they almost appear as straight lines, but they are increasing like the HadCRUT5 record. The logarithm of CO₂^[23] is also nearly a straight line, and very slightly increasing. The CO₂ data are interpolated yearly averages to avoid the seasonal wiggles.

The ENSO 3.4, sunspots (SN Norm), and CO₂ (Log CO₂ Norm) records used in the study are from well-known datasets.^[24] The longer-term solar cycles are created using a sinusoid function^[25] of the form:

Cycle (t) = cos(2πft – offset)

Where the cosine argument is in radians, f=frequency, t=time, and the offset is used to align the sine wave with assumed cycle lows (cold periods) from Ilya Usoskin^[26] and Joan Feynman.^[27] For more on this transform, used in Fourier analysis, see David Evans’ paper here.^[28] These lows are not precise and must be estimated from the available data. The actual values used, and the precise functions are in the supplementary materials which are linked at the end of this post.
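As a rough illustration of the formula above, the sketch below builds one idealized cycle in Python. The function name solar_cycle, the 210-year period, and the low year of 1680 are hypothetical placeholders, not the values used in the study. The offset is chosen so that the cosine reaches its minimum (-1) in the assumed low year, which is one way to align the wave with a cold period.

```python
import math


def solar_cycle(t_years, period_years, low_year):
    """Return cos(2*pi*f*t - offset), with a minimum at low_year."""
    frequency = 1.0 / period_years
    # Choose the offset so the cosine argument equals -pi at low_year.
    offset = 2.0 * math.pi * frequency * low_year + math.pi
    return math.cos(2.0 * math.pi * frequency * t_years - offset)


# Hypothetical example: a 210-year cycle with an assumed low in 1680.
at_low = solar_cycle(1680.0, 210.0, 1680.0)
at_half_period = solar_cycle(1785.0, 210.0, 1680.0)
at_next_low = solar_cycle(1890.0, 210.0, 1680.0)
```

With these placeholder values the cycle sits at -1 in 1680, peaks at +1 half a period later in 1785, and returns to -1 in 1890.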

The Multiple Regression Model

I performed a number of regressions with the variables plotted in figure 3 and various subsets of them. In every case where I could tell, the statistically most important single variable, judging from AIC,^[29] sum of squares, and R², was the logarithm of CO₂. However, all the variables were significant, and CO₂ compared to the impact of all the others combined was small, as we will see. AIC ranks the input predictors for the 1958 case as follows: Log_CO₂, Nino_3_4, Hallstatt, Eddy, Pentadecadal, sunspots, and finally de Vries. AIC is based on the sum of squares, so it can be problematic in autocorrelated series^[30] like these. The plots below give you a feel for the relative importance of the main variables, which is hard (maybe impossible) to calculate statistically with any precision, mainly due to the brief period of our instrumental data and the long periods of the important solar cycles. The next four plots are for the whole instrumental record, 1850 to 2023. Figure 4 includes all the variables in the study.
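To make the AIC comparison concrete, here is a small sketch of the least-squares form of the criterion, AIC = n ln(RSS/n) + 2k, where n is the number of observations, RSS is the residual sum of squares, and k is the number of fitted parameters. This form drops an additive constant, so the values will differ from what R's AIC function reports, although differences between models fitted to the same data are comparable. The function name and the residuals are hypothetical.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC up to an additive constant for a least-squares fit."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical residuals from two competing models of the same data.
residuals_a = [0.1, -0.1, 0.1, -0.1]  # tighter fit, 3 parameters
residuals_b = [0.2, -0.2, 0.2, -0.2]  # looser fit, 2 parameters
aic_a = aic_least_squares(residuals_a, 3)
aic_b = aic_least_squares(residuals_b, 2)
```

Here model A has the lower AIC despite its extra parameter, because its residuals are much smaller. With autocorrelated residuals the RSS is not an honest measure of information lost, which is why the ranking above should be read with caution.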

Figure 4. A model with all series, including log(CO₂). The fine gray line is the monthly HadCRUT5 data, and the blue line is smoothed with an 11-year moving average. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990. The orange line is the model.

Figure 5 uses all the variables except Log_CO₂. In both figures the blue line is the smoothed HadCRUT5 record, and the fine gray line is the monthly HadCRUT5 data. The orange line is the model. We can see that Log_CO₂ visually adds little to the match between observations and the model. Significant improvement is visible around 1940; otherwise the two models are about the same.

Figure 5. Plot of the regression with log(CO₂) removed from the list of predictors. The R² drops to 0.84 and there is noticeable deterioration in the fit between 1935 and 1947. The fine gray line and the blue line are as before. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

Figure 6 compares the model that uses Log_CO₂ to the model that only uses the solar related variables. The two models are similar. The only noticeable differences are before 1940 when CO₂ was supposedly not very important. It is possible that the differences are due to data quality. As we will see, the data prior to 1958 was lower in quality than the data after that date.

Figure 6. A comparison of the “no CO₂” versus “with CO₂” models. All other predictors are in both models. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

In figure 7 we model HadCRUT5 with only CO₂. While the R² is 0.8 and the model generally follows HadCRUT5, the model lacks the granularity and detail that is apparent in figures 5 and 6. The IPCC calls the granularity natural variability and dismisses it as statistical “noise” that is random. Notice the P-value doesn’t change. The P-value is of little use in models like this that have a lot of observations and produce good matches. It is not a good measure of model quality.

Figure 7. HadCRUT5 is modeled with only CO₂. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

Next, we repeat the above four plots using a new model that only uses the data between 1958 and the present day. This is the longest period possible with good data. To get another upward step change in data quality we would need to move to 2005 when the ARGO array became sufficiently large to produce better data on ocean temperatures than we can get from ships. But only 17 years of good ocean data is not long enough to judge the influence of the longer solar cycles.

Figure 8 shows a good visual match between observations and a model with all the variables. It also has an R² of 0.9, which would be impressive if the variables were independent and not autocorrelated. The mismatch between 1992 and 1995 is probably due to the Pinatubo eruption in 1991, which was not incorporated into this model.

Figure 8. A model with all predictors, including CO₂, from 1958 to the present day. The Pinatubo eruption is identified. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

Figure 9. A model from 1958 with all predictors except CO₂. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

Figure 9 is the model with all variables except for CO₂. The match is still good, but there are differences in detail suggesting that adding CO₂ makes a difference. The large difference just after 1992 is probably due to the influence of the Mt. Pinatubo eruption in the summer of 1991. The effect of the eruption lasted several years. With the exception of the Pinatubo eruption, the model is almost as good as the model that includes CO₂, at least visually.

Figure 10. Models with and without CO₂ over the 1958 to present period. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

Figure 10 compares the models with and without CO₂ directly, and except for the period right around the Mt. Pinatubo eruption, the match is excellent. I’m not saying that Pinatubo had an effect before it erupted, just that the large impact of the eruption on the HadCRUT5 record (see Figure 11) could have distorted the two regressions differently in that period. Possibly the addition of CO₂ makes a small difference, but it isn’t apparent in this plot anywhere except around the eruption.

Figure 11. A model with only CO₂ as a predictor from 1958 to the present. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

Figure 11 shows a model using only the logarithm of CO₂. There is a general correspondence of temperature and CO₂, but a great deal of detail is missing that we see in the other models. We can argue that the variation of the HadCRUT5 record around the orange model in figure 11 is not random noise if it can be modeled with solar cycles.

A word on statistics

The risk in evaluating regression statistics of models of autocorrelated series is most easily seen by considering that any two monotonically increasing time series, for example CO₂ and temperature since 1850, will appear to correlate, even if they are unrelated. This is why I often hate statistics: too often, statistical measures of fit, like R², or computed statistical probabilities are used to gaslight readers into believing something that isn’t true. Your first judgment of a correlation should be made with a plot of the data versus the model; the second should be a plot of the residuals. Are the residuals evenly dispersed about zero, or do they have a trend? All the residual plots for the top models in this post are trendless, as they should be.
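The point about trending series can be illustrated with a short Python sketch. The two series below are synthetic and deliberately unrelated: each is a straight line plus its own wiggle at a different frequency. Comparing year-to-year changes instead of levels is a check I am adding for illustration only; it is not a step used in the regressions in this post, and all names and numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def first_differences(series):
    """Change from each point to the next, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


# Two synthetic, unrelated series that both trend upward.
n_points = 170
series_a = [t + math.sin(1.3 * t) for t in range(n_points)]
series_b = [0.5 * t + math.cos(2.1 * t) for t in range(n_points)]

r_levels = pearson_r(series_a, series_b)
r_changes = pearson_r(first_differences(series_a), first_differences(series_b))
```

The correlation of the levels comes out very close to 1 simply because both series rise, while the correlation of the changes is close to zero. The shared trend, not any relationship in the detail, produces the high correlation.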

The main point is to trust your eyes, not statistical measures of the fits, they are secondary. Sometimes the obvious is correct. To illustrate this point, I used a stepwise regression to order the models. To generate these four models, I removed the top variable (according to its AIC) and reran the regression with the remaining variables until the visual model did not match HadCRUT5 very well. The procedure suggests the most important variables are Log_CO2, Hallstatt, and Eddy. The four acceptable stepwise regression models are plotted in figure 12.

Figure 12. The four best forward stepwise regression models. The vertical scale is the HadCRUT5 temperature anomaly in degrees C, relative to 1961-1990.

The first stepwise model (All) chose the variables listed in the figure. The variables are listed in order of importance according to their AIC scores. The best models, visually, are “All” and “no CO2,” and it is hard to tell the difference between the two. Notice that when CO2 was removed from the selection list, more variables were chosen.

After Hallstatt is removed, the list of chosen variables shrinks, but the model degrades a lot visually. Once Eddy is removed, the model becomes very poor. The top variable, by AIC, is Log_CO₂, but when Log_CO₂ is removed from the model (the green curve) the match to HadCRUT5 is still good. Other models were also evaluated in this fashion, but these three are the best.

The variables that came out consistently on the bottom, according to AIC, were the Pentadecadal cycle and sunspots. However, removing these variables always caused the model to visually deteriorate unacceptably. Thus, AIC, while useful, is not a good sole criterion for the value of variables or models. Always look at the plots.

Conclusions

There are several logical conclusions from this study.

1. A successful model can be built using only solar cycles, ENSO, and the sunspot record.
2. Adding CO₂ to the model described in (1) above adds a little to the fit, mostly in short intervals, like from 1935 to 1940 and in the middle 1990s around the Pinatubo eruption.
3. Standard statistical measures, like AIC, R² or the P-value, cannot be used as the sole measure of the success of the model. Evaluating the plots is critical.

This study shows that solar variability, at least statistically, correlates to HadCRUT5 at least as well as CO₂. Since HadCRUT5 is one of the main global average surface temperature records used by the IPCC to measure climate change, their conclusion, as stated in the AR6 Technical Summary is:

“Taken together with numerous formal attribution studies across an even broader range of indicators and theoretical understanding, this underpins the unequivocal attribution of observed warming of the atmosphere, ocean, and land to human influence.”
AR6 TS, page 63, emphasis added.

This is incorrect, and the result of their unsupported assertion that the Sun has no influence on climate. They should seriously investigate the influence of solar variability on climate change. I expected to have to deal with lagged solar effects on climate in this study going in. Possible multi-year lags between solar events and related climatic effects are mentioned in many papers (example here, other examples are cited in Eichler, et al.), but the observation/model matches in this post were all achieved with no lags.

I would like to thank Charley May and David Evans for their help with this post, although if there are any errors, they are mine alone.

Download the bibliography here.

Download the supplementary material here. You will find the R code to create all the models and Excel code to make the main models; not all the R models can be made in Excel. To run the models in Excel you will need the “Analysis ToolPak” and the “Solver” Add-in. These are found under File/Options/Add-ins.

[1] (IPCC, 2021, p. 961)
[2] See especially: (Connolly et al., 2021), (Hoyt & Schatten, 1997), (Soon, Connolly, & Connolly, 2015), (Usoskin I. , 2017), (Usoskin, Gallet, Lopes, Kovaltsov, & Hulot, 2016), (Scafetta N. , 2023), (Vahrenholt & Lüning, 2015), and (Judge, Egeland, & Henry, 2020).
[3] (Hoyt & Schatten, 1997) and (Bray, 1968)
[4] ¹⁴C is the Carbon-14 isotope. Except for nuclear bombs, it is only created in the atmosphere by galactic cosmic rays, which increase when the Sun is less active. It has been used as a proxy for solar activity for many decades. It is stored in tree rings, which provide a convenient and accurate date for each ¹⁴C concentration. (Cain & Suess, 1976) and (Cain W. , 1975).
[5] ¹⁰Be is an isotope of Beryllium that is created by cosmic rays and is also inversely correlated with solar activity. It is stored in ice cores. (Beer, Blinov, Bonani, & al., 1990).
[6] (Beer, Blinov, Bonani, & al., 1990) and (Hoyt & Schatten, 1997, p. 174)
[7] (Bray, 1968)
[8] (Delaygue & Bard, 2011)
[9] https://www.sidc.be/SILSO/datafiles
[10] (Bray, 1968)
[11] (Abreu, Beer, & Ferriz-Mas, 2010)
[12] Joan Feynman studied this centennial cycle and the pentadecadal cycle for many years. She called it the Gleissberg cycle, but since many have used the name Feynman cycle, we continue with that name here (Feynman & Ruzmaikin, 2014). See also (Peristykh & Damon, 2003).
[13] The Pentadecadal cycle was first recognized by Rudolf Wolf in 1862 (Peristykh & Damon, 2003). He recognized that two or three high cycles were often followed by two or three low cycles. More formal recognition of the cycle was made by (Feynman & Ruzmaikin, 2014) and (Clilverd, Clarke, Ulich, Rishbeth, & Jarvis, 2006).
[14] (Peristykh & Damon, 2003)
[15] (Judge, Egeland, & Henry, 2020) and (Baliunas, et al., 1995)
[16] (Peristykh & Damon, 2003)
[17] (Roy, 2014)
[18] https://www.sidc.be/SILSO/datafiles
[19] https://climexp.knmi.nl/getindices.cgi?WMO=NCDCData/ersst_nino3.4a&STATION=NINO3.4&TYPE=i&id=someone@somewhere
[20] https://www.metoffice.gov.uk/hadobs/hadcrut5/
[21] 1958 was the International Geophysical Year (IGY), which led to gathering much higher quality climate and climate-related data. It is notable that the late S. Fred Singer was one of the organizers of this project and that it was organized in James Van Allen’s living room in 1950. According to Van Allen, it was his wife’s (Abigail) chocolate cake that sealed the deal that day. (Korsmo, 2007).
[22] (McKitrick & Christy, 2018) and (McKitrick & Christy, 2020)
[23] Temperature varies as the base-2 logarithm of CO2 concentration, which means that each doubling of CO2 raises temperature by the same amount. As CO2 concentration increases, the effect of each additional increment on surface temperature decreases. (Romps, Seeley, & Edman, 2022) and (Wijngaarden & Happer, 2020)
[24] ENSO 3.4 is from ERSST, which only goes back to 1854. 1850 through 1853 are filled in with the Webb, 2022 ONI. The sunspot number is from SILSO, and the CO2 concentration data are from NASA and NOAA. The CO2 record is interpolated yearly averages to avoid the seasonal changes.
[25] (Evans, 2013)
[26] (Usoskin, Gallet, Lopes, Kovaltsov, & Hulot, 2016) and (Usoskin I. , 2017)
[27] (Feynman & Ruzmaikin, 2014)
[28] (Evans, 2013)
[29] AIC stands for Akaike Information Criterion. It estimates the information lost by using the regression model in the place of the measurements. Like R², it is based on the sum of squares and is susceptible to inflation (making variables and models look better than they actually are) due to autocorrelation. The Wikipedia article on this metric is helpful, see here. The lower the AIC value the better the model.
[30] All the input time series used in these multiple regression models are autocorrelated, which simply means each value in the series is highly dependent on its previous values, so the values are not independent of one another as required by the rules of regression. This artificially inflates the statistical measures often used to evaluate the quality of a regression, such as the R² value shown in some of the plots.

157 Comments

See - owe to Rich

November 18, 2023 4:55 am

I am with Bellman on this one. I have less detailed analysis than him (her?), but arguably a more succinct point. It is to do with “confounding”, where two variables might have exactly the same effect (true confounding) or just a somewhat similar effect. The variables I would point to here are CO2 and the Bray cycle, which are both close to straight lines over the 1958-present period. To be confident of a model, you need to show it going up and down with the data, but the data aren’t going up and down in this period! At least in my paper I used HadCRUT4 from 1850-2006 which did involve some ups and downs: On the Influence of Solar Cycle Lengths and Carbon Dioxide on Global Temperatures, Journal of Atmospheric and Solar-Terrestrial Physics 173 (2018), https://doi.org/10.1016/j.jastp.2018.01.026
or https://github.com/rjbooth88/hello-climate/files/1835197/s-co2-paper-correct.docx .

So, over that shorter period, I think you could replace either CO2 or the Bray Cycle with any other fairly straight line and get as good a fit. For example, you could use your age in each of the years 1958-2023! The globe is warming because you have been getting older! Perhaps when you die it will stop warming… But in any case I wish you a long life : – )

Andy May

Author

Reply to See - owe to Rich

November 18, 2023 1:37 pm

See, I agree with you, and I agree that Bellman has a valid point. CO2 is a non-entity statistically, any straight upward sloping line can replace it, either Eddy or Hallstatt works fine.

This is a point that is often lost, but I’m preparing another post that will try and hammer it home.

Bellman also asks about offsets for the sine waves. I need to look into that, but I haven’t yet.

On a higher level, our real problem is the short instrumental record and the very long solar cycles. I doubt we can be definitive; the record is too short.

The key point? Anything can replace CO2.

See - owe to Rich

Reply to Andy May

November 20, 2023 2:08 am

Yes, anything linear can replace CO2 to get a good fit. But CO2 has a plausible physical causative action on global temperature, whereas most other things don’t (including your age). Yes, the sun also has a great causative action, but you would need to demonstrate statistics showing the sun is actually on a substantively upward trend in line with the Bray Cycle.

Tom Abbott

November 18, 2023 6:16 am

From the article: “In figure 7 we model HadCRUT5 with only CO2”

I would like to see a model comparing CO2 to temperatures using Hansen 1999:

Obviously, CO2 increases do not correlate with the U.S. temperature chart. CO2 rises continuously, while the U.S. temperatures warm and cool significantly over decades.

According to CO2 disaster theory, if CO2 increases, the temperatures increase, but that’s not happening in the United States.

How do we explain this discrepancy?

My explanation is the IPCC is using a bogus, bastardized Hockey Stick chart to do their comparisons. If they used a real temperature chart, like the U.S. regional chart, one that accurately showed the temperatures, they would find that CO2 appears to be irrelevant to temperatures.

If you see a temperature chart that does not show the Early Twentieth Century as being as warm as today, then you are looking at a bogus, bastardized Hockey Stick chart that was created specifically to correlate temperatures with CO2 levels, as a means of selling the Catastrophic Anthropogenic Global Warming (CAGW) scaremongering.

The bogus Hockey Stick temperature profile is the BIG LIE of alarmist climate science. It does not represent reality. Its temperature profile looks nothing like the temperature profiles of historic, written, regional temperature records from all around the world.

It’s a device to sell a lie. And it’s all the climate change alarmists have to point to as “evidence” that CO2 is dangerous. And it’s all made up out of whole cloth in corrupt data manipulator’s computers.

No Hockey Stick chart = No basis for CO2 scaremongering.

Nepal2

November 18, 2023 5:12 pm

Andy,

I’m not able to look at your code right now, can you say how many fitting parameters are used in these models?

I’m a bit suspicious that there are so many fitting parameters that you can fit the temperature with literally any input. In which case of course it wouldn’t matter if you removed CO2.

Andy May

Author

Reply to Nepal2

November 19, 2023 4:06 am

Nepal2,
I constructed several models and all had different numbers of time series as input. I didn’t use any parameters if I understand your meaning. Currently I’m trying to eliminate time series from the regression to get down to the minimum required. My latest suitable models have the following series as input:

Model with CO2:
HadCRUT5 ~ Log_CO2 + Eddy + de.Vries + Nino_3_4 + Hale

Model replacing CO2 with Hallstatt, with almost the same fit:
HadCRUT5 ~ Hallstatt + Eddy + de.Vries + Nino_3_4 + Hale

Hale is new, it is a very well documented 22.14-year solar cycle. Oddly sunspots did not make the cut, not sure why. Maybe Nino_3_4 captures the important part of Schwabe cycle.

These seem to be the key series and Log_CO2 and Hallstatt are interchangeable. Don’t worry about overfitting, no model produced will be evidence for anything. The time period of our data is too short and the solar cycles are too long. The key point here is that the longer solar cycles can replace CO2 in the model and give the same result.

Bellman is concerned that random time series can model HadCRUT5. An interesting point, if true, I will look into that. I’m using documented time series now.

Anyway, I appreciate all the help I’ve received in these comments from everyone. Very good discussion.

Nepal2

Reply to Andy May

November 19, 2023 6:04 am

Right, my thinking is similar to bellman. If you use an extremely flexible model, a good fit doesn’t tell you anything about reality. It just affirms that your model is very flexible.

dh-mtl

November 19, 2023 4:49 pm

Andy May,

Thanks for this article and the work that you have done to characterize global temperatures using multiple regression analysis.

However, I would like to point out that your model fails to capture the true significance of Nino 3.4:

To correctly treat Nino_3.4, (and other variables) you must firstly consider the time lags between cause and effect.

In order to determine the characteristic time lags, I have used the exponentially weighted moving average (EWMA) of Nino 3.4, where:

EWMA (t) = λ * Nino 3.4(t) + (1-λ) * EWMA (t-1).

I have then calculated the correlation coefficient (R) between the EWMA of Nino 3.4 and global temperature databases (for example HadCrut4 and UAH, monthly data) for various values of λ to determine which values of λ produce a maximum of the correlation coefficient. Each maximum in the correlation coefficient is indicative of a time lag between Nino 3.4 and its effect on global temperatures.
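The EWMA recursion above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used for the results in this comment. The function name ewma is hypothetical, and starting the recursion at the first observation is an assumption, since the formula does not say how EWMA(0) is set.

```python
def ewma(series, lam):
    """EWMA(t) = lam * x(t) + (1 - lam) * EWMA(t-1).

    The recursion is seeded with the first value of the series,
    which is an assumption made for this sketch.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(lam * value + (1.0 - lam) * smoothed[-1])
    return smoothed


# Hypothetical index values, not actual Nino 3.4 data.
toy_index = [0.0, 10.0, 10.0, 10.0]
smoothed_half = ewma(toy_index, 0.5)
unsmoothed = ewma(toy_index, 1.0)
```

With λ = 1 the output equals the input, and smaller values of λ respond more slowly to a step change. The scan described above would repeat this for a range of λ values and correlate each smoothed series with the temperature record.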

Secondly, you must consider the mechanism(s) by which Nino 3.4 affects global temperatures. It turns out that when looking at the correlation coefficient R (Nino 3.4 / Hadcrut4) as a function of λ, the function is bi-modal. There is a maximum at λ ~ 0.12 (equivalent to about an 8 month moving average), and another at λ < 0.01 (equivalent to very long moving averages of greater than 100 months).

This bi-modality reflects the fact that Nino 3.4 affects global temperatures through two distinct mechanisms. The first is a convection mode, i.e direct transfer of energy into the atmosphere in the Nino region by evaporation and its subsequent distribution throughout the atmosphere by convection. This leads to the well-known approximately 4 month lag (i.e. 8 month moving average) between Nino 3.4 and its effect on global temperatures.

The second mode is advection, i.e. the transport of the Nino 3.4 waters throughout the oceans by ocean currents. This mode is at least an order of magnitude slower than convection, with a λ of < 0.01, equivalent to a moving average of more than 100 months.

The advection mode has been noted by Simon Tisdale who called it a ‘permanent effect’ of Nino 3.4, by Eschenbach (‘Adding it Up’, WUWT 2021-04-09) and Stockwell and Cox (JOURNAL OF GEOPHYSICAL RESEARCH, VOL. ???, XXXX, DOI:10.1029/). Both Stockwell and Cox and Eschenbach used a CUSUM function of Nino-3.4 to correlate with temperatures. It should be noted that the CUSUM function carries essentially the same information as the EWMA with a small value of λ.

This second mode, advection, is found to have a much more significant effect on global temperatures than the convection mode, albeit with a much greater time lag. Given the time lags, the advection mode would not be visible at all in your treatment of Nino 3.4.

Summary:

In order to fully consider Nino 3.4 in a multiple regression analysis it is necessary to:

consider separately the two modes by which Nino 3.4 affects global temperatures, convection and advection.

properly address the time lags between Nino 3.4 and its effects on the global temperatures, of the order of a few months for convection, and of the order of a decade or so for advection.

The best way to do the above is:

to characterize the convection mode using an EWMA of Nino 3.4 with a value of λ ≈ 1/8

to characterize the advection mode with an EWMA of Nino 3.4 with a λ < 0.01, from which the short term variations have been removed (for example by subtracting from it the EWMA used to characterize the convection mode).

If you treat Nino 3.4 as suggested above, splitting it into two functions, one for each mode, and properly accounting for the time lags, you will find that it has a much stronger correlation with global temperatures than your work suggests.
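The two-function treatment summarized above can be sketched roughly as follows. Everything here is illustrative: the input is a synthetic autocorrelated series rather than the real Nino 3.4 index, the slow λ of 0.005 is just one value below 0.01, the coefficients 0.1 and 0.5 are arbitrary, and the names `two_mode_regressors`, `fast` and `slow` are hypothetical. The fit is ordinary least squares, which is an assumption about how the two terms would enter a multiple regression.

```python
import numpy as np

def ewma(x, lam):
    """EWMA(t) = lam * x(t) + (1 - lam) * EWMA(t-1), seeded with x[0] (assumption)."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    prev = x[0]
    for t in range(x.size):
        prev = lam * x[t] + (1.0 - lam) * prev
        out[t] = prev
    return out

def two_mode_regressors(nino, lam_fast=1 / 8, lam_slow=0.005):
    """Split one index into a fast (convection) and a slow (advection) regressor."""
    fast = ewma(nino, lam_fast)
    # Remove the short-term variations from the slow term by subtracting
    # the fast EWMA, as suggested in the comment.
    slow = ewma(nino, lam_slow) - fast
    return fast, slow

# Synthetic stand-in for a monthly index (NOT real Nino 3.4 data).
rng = np.random.default_rng(1)
n = 1800
nino = np.zeros(n)
for t in range(1, n):
    nino[t] = 0.9 * nino[t - 1] + rng.standard_normal()

fast, slow = two_mode_regressors(nino)

# Synthetic "temperature" with arbitrary, made-up coefficients.
true_coef = np.array([0.1, 0.5])
temp = true_coef[0] * fast + true_coef[1] * slow + 0.01 * rng.standard_normal(n)

# Ordinary least squares on [intercept, fast, slow].
X = np.column_stack([np.ones(n), fast, slow])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
print("intercept, fast, slow coefficients:", coef)
```

In a real analysis the `fast` and `slow` columns would simply be added alongside the other predictors in the regression, in place of the raw Nino 3.4 series.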

Ulric Lyons

November 20, 2023 11:37 am

“Studies of 14C, 10Be, and sunspot records have uncovered four major long-term solar cycles. These are the Hallstatt (or Bray) cycle of about 2,400 years,[10] the Eddy Cycle of about 1,000 years,[11] the de Vries (or Suess) cycle of about 210 years, Feynman (or Gleissberg) cycle of about 105 years,[12] and the Pentadecadal cycle of about 50 years.[13] All the cycle periods are approximate, further, they may vary over geological time.[14] Some may not like my use of the term “cycles,” since our understanding of the cycle periods and the strength or power of each cycle is poor.”

The astronomical mean for grand solar minima series is 863.311 years, so the Eddy cycle is too long. The Gleissberg cycle of centennial solar minima is highly variable, from 7 to 12 sunspot cycles long, with a long term mean length of 107.9 years. The AMO has a long term mean frequency of 54 years as every other warm phase is during a centennial solar minimum.

The Lyons cycle:

https://docs.google.com/document/d/1YOu7hHVEuaWWLuztj6ThEsJd7Z-765Uz-L68lQbRdbQ/edit
