By Andy May
In my last post I plotted the NASA CO2 and HadCRUT5 records from 1850 to 2020 and compared them. This was in response to a plot posted on Twitter by Robert Rohde implying that they correlate well. The two records appear to correlate: the resulting R2 is 0.87. The least-squares function used made the global temperature anomaly a function of the logarithm to the base 2 of the CO2 concentration (or ‘log2CO2’). This means the temperature change was assumed to be linear with each doubling of the CO2 concentration, a common assumption. The least-squares (or ‘LS’) methodology assumes there is no error in the measurements of the CO2 concentration and that all error resulting from the correlation (the residuals) resides in the HadCRUT5 global average surface temperature estimates.
In the comments to the previous post, it became clear that some readers understood that the computed R2 (often called the coefficient of determination) from LS was artificially inflated because both X (log2CO2) and Y (HadCRUT5) are autocorrelated and increase with time. But a few did not understand this vital point. As most investors, engineers, and geoscientists know, two time series that are both autocorrelated and increase with time will almost always have an inflated R2. This is one type of “spurious correlation.” In other words, the high R2 does not necessarily mean the variables are related to one another. Autocorrelation is a big deal in time series analysis and in climate science, but it is too frequently ignored. To judge any correlation between CO2 and HadCRUT5 we must look for autocorrelation effects. The most commonly used tool is the Durbin-Watson statistic.
The Durbin-Watson statistic tests the null hypothesis that the residuals from a LS regression are not autocorrelated against the alternative that they are. The statistic is a number between 0 and 4. A value of 2 indicates no autocorrelation, a value below 2 suggests positive autocorrelation, and a value above 2 suggests negative autocorrelation. Since the computation of R2 assumes that each observation is independent of the others, we hope for a value near 2, so that the R2 is valid. If the regression residuals are autocorrelated rather than random (that is, normally distributed about the mean), the R2 is invalid and too high. In the statistical program R, this is done, using a linear fit, with only one statement, as shown below:
This R program reads in the HadCRUT5 anomalies and the log2CO2 values from 1850 to 2020 plotted in Figure 1, then loads the R library that contains the durbinWatsonTest function and runs the function. I supply the function with only one argument, the output from the R linear regression function lm. In this case we ask lm to compute a linear fit of HadCRUT5 as a function of log2CO2. The Durbin-Watson (DW) function reads the lm output and computes the DW statistic of 0.8 from the residuals of the linear fit by comparing them to themselves with a lag of one year.
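The statistic itself is easy to compute from the residuals. Here is a minimal Python sketch of the textbook formula, DW = Σ(e_t − e_{t−1})² / Σ e_t² (an illustration of the formula, not the source of R’s durbinWatsonTest):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: the sum of squared successive
    differences of the residuals, divided by their sum of squares."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Values near 2 indicate uncorrelated residuals; smooth, drifting
# residuals like those in Figure 3 give values well below 2.
```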
The DW statistic is significantly less than 2, suggesting positive autocorrelation. The p-value is essentially zero, which means we reject the null hypothesis that the HadCRUT5-log2CO2 linear fit residuals are not autocorrelated. That is, they are very likely autocorrelated. R makes the calculation easy, but it is unsatisfying, since we don’t gain much understanding from running it or from the output. So, let’s do the same calculation with Excel and go through all the gory details.
The Gory Details
The basic data used is shown in Figure 1; it is the same as Figure 2 in the previous post.
Strictly speaking, autocorrelation refers to how a time series correlates with itself at a time lag. Visually we can see that both curves in Figure 1 are autocorrelated, like most time series. What this means is that a large part of each value is determined by its preceding value. Thus, the log2CO2 value in 1980 is very dependent upon the value in 1979, and the same is true of the 1980 and 1979 values of HadCRUT5. This is a critical point, since all LS fits assume that the observations used are independent and that the residuals between the observations and the predicted values are random and normally distributed. R2 is not valid if the observations are not independent, and a lack of independence will be visible in the regression residuals. Below is a table of autocorrelation coefficients for the curves in Figure 1 for time lags of one to eight years.
The autocorrelation values in Table 1 are computed with the Excel formula found here. The autocorrelation coefficients shown, like conventional correlation coefficients, vary from -1 (negative correlation) to +1 (positive correlation). As you can see in the table, both HadCRUT5 and log2CO2 are strongly positively autocorrelated; each value depends heavily on the one before it, and both series increase with time, as a glance at Figure 1 confirms. The autocorrelation decreases with increasing lag, which is normally the case. All that means is that this year’s average temperature is more closely related to last year’s temperature than to the year before, and so on.
Row one of Table 1 tells us that about 76% of each HadCRUT5 temperature and over 90% of each NASA CO2 concentration is dependent upon the previous year’s value. Thus, in both cases, the yearly values are not independent.
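For readers who want to check values like those in Table 1 without Excel, here is a minimal Python sketch of the lag-k autocorrelation coefficient, using the usual overall-mean normalization (the linked Excel formula may normalize slightly differently):

```python
def autocorr(x, lag):
    """Lag-k autocorrelation: covariance of the series with itself
    shifted by `lag`, normalized by the overall variance."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    den = sum((v - mean) ** 2 for v in x)
    return num / den
```

Applied to the yearly HadCRUT5 and log2CO2 series with lags of one to eight years, this reproduces the kind of decreasing-with-lag pattern shown in Table 1.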
While the numbers given above apply to the individual curves in Figure 1, autocorrelation can clearly affect the regression statistics when the temperature and CO2 curves are regressed against one another. This bivariate autocorrelation is usually examined using the Durbin-Watson statistic mentioned above, and named for James Durbin and Geoffrey Watson.
As I did in the R program above, the Durbin-Watson calculation is traditionally performed on a linear regression of the two variables of interest. Figure 2 is like Figure 1, but we have fit LS lines to both HadCRUT5 and log2CO2.
In Figure 2, orange is log2CO2 and blue is HadCRUT5. The residuals are shown in Figure 3; notice they are not random and appear autocorrelated, as we would expect from the statistics in Table 1. They also share the same shape, which is worrying.
The next step in the DW process is to derive a LS fit to the residuals shown in Figure 3; this is done in Figure 4.
Just as we feared, the residuals correlate and have a positive slope. Doing the DW calculations in this fashion, we get a DW statistic of 0.84, close to the value computed in R, but not exactly the same. I suspect the multiple sum-of-squares computations over 170 years of data lead to the subtle difference of 0.04. We can check this by performing the R calculation using the Excel residuals:
This confirms that both calculations match, but that there were differences in the sum-of-squares calculations due to the different floating-point precision used in Excel and R. So, with a linear fit to both HadCRUT5 and log2CO2, there are serious autocorrelation problems. But both curves are concave upward, so what if we used an LS fit that is more appropriate than a line? The plots look like a second-order polynomial, so let’s try that.
Figure 5 shows the same data as in Figure 1, but we have fit 2nd order polynomials to each of the curves. The CO2 and HadCRUT5 data curve upward, so this is a big improvement over the linear fits above.
I should mention that I did not use the equations on the plot for the calculations; I did a separate fit using decades. The decades were computed with 1850 as zero, 1850 to 1860 as one decimal decade, and so on to 2020, so that the X variable had smaller values in the sum-of-squares calculations. This is to get around the Excel floating-point precision problem already mentioned.
The next step in the process is to subtract the predicted or trend value for each year from the actual value to create residuals. This is done for both curves, the residuals are plotted in Figure 6.
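The detrending step (fit a 2nd order polynomial, subtract it, keep the residuals) can be sketched in Python with NumPy. The series below is synthetic placeholder data, not HadCRUT5; the rescaling of years to decades follows the precision workaround described above:

```python
import numpy as np

years = np.arange(1850, 2021)
decades = (years - 1850) / 10.0          # rescaled x, as in the post
# Synthetic concave-up placeholder series standing in for HadCRUT5.
series = 0.005 * decades**2 - 0.02 * decades + 0.1

coeffs = np.polyfit(decades, series, 2)  # 2nd order polynomial fit
trend = np.polyval(coeffs, decades)
residuals = series - trend               # what Figure 6 plots

# The residuals vanish here only because the placeholder data is
# exactly quadratic; with the real data they retain the structure
# seen in Figure 6.
```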
Figure 6 shows us that the residuals of the polynomial fits to HadCRUT5 and log2CO2 still have structure, and the structure visually correlates, which is not a good sign. This is the portion of the correlation left after the 2nd order fit is removed. In Figure 7 I fit a linear trend to the residuals. The R2 is less than in Figure 4.
There is still a signal in the data. It is positive, suggesting that if the autocorrelation were truly removed with the 2nd order fit (we cannot say that statistically, but “what if”), there is still a small positive change in temperature, as CO2 increases. Remember, autocorrelation doesn’t say there is no correlation, it just invalidates the correlation statistics. If temperature is mostly dependent upon the previous year’s temperature, and we can successfully remove that influence, what remains is the real dependency of temperature on CO2. Unfortunately, we can never be sure we removed the autocorrelation and can only speculate that Figure 7 may be the true dependency between temperature and CO2.
The Durbin-Watson statistic
Now we repeat the calculations for the Durbin-Watson joint autocorrelation, but this time using a 2nd order polynomial regression. Below is a table showing the Durbin-Watson statistic between HadCRUT5 and log2CO2 for a lag of one year. The calculations were done using the procedure described here.
The Durbin-Watson value of 0.9, for a one-year lag, confirms what we saw visually in Figures 5 and 6. The residuals are still autocorrelated, even after removing the second order trend. The remaining correlation is positive, as we would expect, presumably meaning that CO2 has some small influence on temperature. We can confirm this calculation in R:
The R2 that results from a LS fit of CO2 concentration and global average temperatures is artificially inflated because both CO2 and temperature are autocorrelated time series that increase with time. Thus, in this case, R2 is an inappropriate statistic. R2 assumes that each observation is independent and we find that 76% of each year’s average global temperature is determined by the previous year’s temperature, leaving little to be influenced by CO2. Further, 90% of each year’s CO2 measurement is determined by the previous year’s value.
I concluded that the best function for removing the autocorrelation was a 2nd order polynomial, but even when this trend is removed, the residuals are still autocorrelated, and the null hypothesis that they were not had to be rejected. It is disappointing that Robert Rohde, a PhD, would send out a plot of a correlation of CO2 and global average temperature (as we showed in Figure 1 of the previous post), implying that the correlation between them was meaningful, without further explanation. But he did.
Jamal Munshi did a similar analysis to ours in a paper in 2018 (Munshi, 2018). He notes that the consensus idea that increasing emissions of CO2 cause warming, and that the warming is linear with the doubling of CO2 (log base 2), is a testable hypothesis. This hypothesis has not tested well, because the uncertainty in the estimate of the warming due to CO2 (climate sensitivity) has remained stubbornly large for over forty years, basically ±50%. This has caused the consensus to try to move away from climate sensitivity toward comparisons of warming to aggregate carbon dioxide emissions, thinking they can get a narrower and more valid correlation with warming. Munshi continues:
“This state of affairs in climate sensitivity research is likely the result of insufficient statistical rigor in the research methodologies applied. This work demonstrates spurious proportionalities in time series data that may yield climate sensitivities that have no interpretation. … [Munshi’s] results imply that the large number of climate sensitivities reported in the literature are likely to be mostly spurious. … Sufficient statistical discipline is likely to settle the … climate sensitivity issue one way or the other, either to determine its hitherto elusive value or to demonstrate that the assumed relationships do not exist in the data.”(Munshi, 2018)
While we used CO2 concentration in this post, many in the “consensus” are now using total fossil fuel emissions in their work, thinking that it is a more statistically valid quantity to compare to temperature. It isn’t; the problems remain, and in some ways are worse, as explained by Munshi in a separate paper (Munshi, 2018b). I agree with Munshi that statistical rigor is sorely lacking in the climate community. The community all too often uses statistics to obscure a lack of data and statistical significance, rather than to inform.
The R code and Excel spreadsheet used to perform all the calculations in this post can be downloaded here.
Key words: Durbin-Watson, R, autocorrelation, spurious correlation
Munshi, J. (2018). The Charney Sensitivity of Homicides to Atmospheric CO2: A Parody. SSRN. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3162520
Munshi, J. (2018b). From Equilibrium Climate Sensitivity to Carbon Climate Response. SSRN. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3142525
Lo and behold, this is why we have calculus.
Well, in this case not so much calculus, but instead statistical analyses.
And this is why we need mathematicians (and coders of R) of the caliber of Andy May.
Misuse and abuse of statistical analysis in some effort to demonstrate a preconceived idea that temp and CO2 do not relate?
Don’t get me wrong, it’s wise to recognize the integral nature of the data. This is almost always ignored. Call it what you will; serial autocorrelation is one way to label it. In the end what you’re interested in is comparing the rates of change. This is precisely what calculus is meant to handle. Calculus and statistics are not separate matters. Calculus and statistics both center on models of relationships. On this occasion adding a derivative term to a statistical model is the simplest way to treat this data. Always find the simplest way to treat the data. Anyone can mangle the data to suit an objective. Sometimes it takes courage to accept what’s in the data and stop.
Here’s an example. It’s just for fun and discussion.
Suppose we are interested in understanding the rate of hair growth on someone’s head versus the calories eaten by the subject daily.
In our first trial we will slowly increase the daily calorie intake for one month, from slightly less than normal to slightly higher than normal intake.
Our hypothesis is that hair growth rate will increase with calorie intake. The person starts with their regular hair length and lets it go for the duration of the trial.
We execute the trial and the observations show that the hair has increased in length over the duration of the trial.
For a reasonable accounting in our statistics I suppose we need to account for the fact that hair length always increases with time. And we also know that we are increasing calorie intake with time. This can be called autocorrelation.
Furthermore, for any data point the length of hair is likely to be more similar to its preceding and subsequent readings than to any other. This can be called serial autocorrelation.
But in this case we know that each data point is not independent. We know that total hair length builds on existing hair length. We are not resetting hair length daily. Each point is not independent. This is given.
The natural way to account for this is to take the derivative of hair length over a time interval. This will give us a rate of hair growth change at any time interval. In this case we take a weekly average derivative. This will give us hair growth rate in each week. Four data points.
This will remove any spurious correlations with time we might find. It’s the natural way to handle this problem. It’s intuitive and straightforward. It’s what people would do without even thinking much about it.
To test our hypothesis we will compare average weekly calorie intake vs weekly hair growth rate (derivative).
This is a physically meaningful way to compare against calorie intake in this trial. We’ll be able to see if the rate of hair growth changes along with weekly average calorie intake. Simple and straightforward.
Maybe in this case we find no statistical fit to average calorie intake and hair growth rate (derivative). So be it, and that is the finding.
An alternative way would be to choose advanced statistical techniques that deal with serial autocorrelation. However, there would be no reason to do so.
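The differencing idea in this thread can be illustrated with a toy example (made-up data, for discussion only): two unrelated noisy series that share only an upward trend correlate strongly in levels, but not in first differences.

```python
import random

random.seed(1)
n = 200
# Two independent noisy series around the same upward trend.
x = [0.1 * t + random.gauss(0, 1) for t in range(n)]
y = [0.1 * t + random.gauss(0, 1) for t in range(n)]

def corr(a, b):
    """Pearson correlation coefficient."""
    na = len(a)
    ma, mb = sum(a) / na, sum(b) / na
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = (sum((p - ma) ** 2 for p in a) *
           sum((q - mb) ** 2 for q in b)) ** 0.5
    return num / den

# First differences (the discrete "derivative").
dx = [x[t] - x[t - 1] for t in range(1, n)]
dy = [y[t] - y[t - 1] for t in range(1, n)]

# In levels the shared trend dominates and corr(x, y) is large;
# in differences the trend is gone and corr(dx, dy) is near zero.
```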
A more appropriate way to implement autocorrelation techniques is in error analysis. Perhaps we are interested to test if measurement errors in our data are propagating or not. This we cannot know for sure, so it makes sense to test for it. We want to test if the errors are building over time or if the errors are independent.
We may also want to test if the errors are correlated with other variables. We might want to account for changes in average humidity in our hair growth rate experiment that might be impacting the results. This is where analysis of residuals in a linear model of hair growth rate vs calorie intake can be useful. We can try correlating the residuals to other factors to gain insights.
If this error approach is the intent of the article then it is interesting absolutely. However, it doesn’t tell us much useful about the nature of the relationship between temperature and CO2. It might only tell a bit about what’s going on at the fringes of error and noise which could be interesting.
In jumping to conclusions what you all seem to be missing is that there very well may be useful signals in the residuals you have found. However, you must first look for and acknowledge any obvious and overwhelming signals in a basic data exploration. A valid analysis of residuals could contain information on any number of interesting factors, from CO2 variations unrelated to temperature – to a possible source of measurement error in one or the other data – to invalid ‘adjustments’ applied to one data set or the other, for example. You won’t see this with a closed mind. There are a lot of keys in here that are being overlooked. The statistical techniques in the article are sound but they are being misapplied and misrepresent the information.
Seeing the number of replies to yourself, you should not overlook the mathematical subject of recursion. 🙂
I suggest you put the gift of your wit towards advancing the discussion. I see your attacks but they are not persuasive or useful.
Very interesting, thanks. I do intend to write more on this subject, this was just meant as an introductory post to introduce the concept of residual analysis and autocorrelation. There are more advanced techniques that we should dive into, that unfortunately are ignored by the “consensus.”
When using a line to connect discrete, real-world data points, the line is most often discontinuous at each data point. Differential calculus is built on the basis of properties of continuous functions, not discontinuous ones.
In more obvious terms, the rate-of-change of parameter x with time (dx/dt) is indeterminate at all discontinuous points in an ensemble of x versus time.
One more observation: one successfully performs a least-squares linear regression analysis (and other forms of curve-fitting) on an ensemble of data using statistical algorithms that do not require any aspect of either differential or integral calculus. Therefore, calculus and statistics can indeed be separate matters.
It’s an interesting theoretical nitpick you propose. Is this why you won’t directly address the apparent correlation between temperature and the CO2 derivative? That would require considerable faith in your thesis on your part. I wasn’t aware this was the basis of your argument. From my vantage point it seems like some effort is being made to avoid this observation by instead focusing on pedantic details and an ongoing undermining attempt. If there is a valid argument I want to hear it.
Quite simply: it is YOU, not me, that chose to go down the rabbit hole of introducing and then conflating calculus with statistics.
As for calling the inability of calculus, which is built on continuous functions, to deal with discontinuous points/functions a “nitpick”: well, I regret that I cannot help you with that, but any elementary introduction to calculus probably can.
BTW, it was Andy May, not me, that wrote the above article showing the various statistical problems when attempting to correlate atmospheric CO2 concentration with global lower atmospheric temperature, given that both are time varying data sets. I suggest you post your arguments about this (anti)correlation to him, not me.
And I never advanced a thesis.
Good luck to you.
Smugness might win you fans but it won’t go far with me. The sampling interval is irrelevant to the nature of the function that we are modelling. If there are discontinuities in atmospheric CO2 that is news to me.
There are obvious discontinuities in the temperature versus time series of data points, and the whole point of the above article was addressing the difficulty of accurately correlating global atmospheric temperature with global atmospheric CO2 concentration.
It only seems difficult when you specifically avoid the very simple methods developed to deal with this kind of data.
I eagerly look forward to your WUWT article on your simplified methods of dealing with the issues that Andy May revealed in his above article.
Correction: “I eagerly look forward to [perpetuating some trivial quarrel I have with you]”.
There is a discontinuity every time winter causes the green grass under the measuring station to change to brown. There is a discontinuity every time the brown grass turns green in the spring. There is a discontinuity every time snow accumulates on the ground below the station. There is a discontinuity every time a tree grows up and shades the measuring station.
Yeah, life’s a bitch, isn’t it.
It has nothing whatever to do with calculus. I said of the original analysis that it proved nothing because any two time series increasing in a linear fashion will show strong correlations — the fact that CO2 concentration is a polynomial is irrelevant because the log of that is, of course, also linear. That the residuals clearly aren’t random also means the results should be ignored.
The really worrying thing here isn’t so much that many people don’t understand this — statistics aren’t easy for people to understand intuitively — but that so many SCIENTISTS don’t understand statistics. This, of course, includes climate scientists. It also means that journalists can be very easily fooled into believing relationships exist when they don’t.
It’s a damning indictment of climate science that many climate scientists don’t do enough to take account of the statistical issues involved in their analyses. All too often I see absurd claims being made because a certain ‘statistic’ is interpreted at face value without due attention being given as to how or why it might have been derived.
Of course, climate science is far from alone in this. Medical science is every bit as guilty. That, however, is no excuse. Either way this is very bad science, full-stop (or ‘period’ as I believe an American would say).
Perhaps I have derailed the conversation into arguing the semantics of calculus. In my view all statistical proofs are completely based in calculus, whether the user of its algebraic notations understands this or not. But the semantics are irrelevant; that is on me. However, choosing a flawed analytical method, such as in this article, does not provide any evidence for or against a correlation between CO2 and temperature.
Statistics is a bitch, but valuable in avoiding silly mistakes. And equally useful in making silly mistakes, which is why it is a bitch.
“I concluded that the best function for removing the autocorrelation was a 2nd order polynomial”
This stuff is all wrong. Choice of polynomial does not remove autocorrelation – it is a function of the data, not the fit. Using the DW test to decide if autocorrelation exists is pointless – it almost certainly does.
Climate Audit treated the subject much more perceptively, eg here. The point of a/c is that it makes trends less significant, and reduces correlation estimates. The proper thing to do is to just calculate the reduced values, eg by the Quenouille correction mentioned by CA. It can make a substantial difference, but that is what you need to know. Better methods can be found here.
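The Quenouille-style correction mentioned here can be sketched as follows. For an AR(1) series with lag-1 autocorrelation r, a standard approximation shrinks the effective number of independent observations before assessing trend significance (this is my illustration of the idea, not code from Climate Audit):

```python
def effective_n(n, r):
    """Approximate effective sample size for an AR(1) series with
    lag-1 autocorrelation r (Quenouille-type correction)."""
    return n * (1 - r) / (1 + r)

# 171 yearly observations with r around 0.76 behave, for
# significance testing, roughly like only a couple dozen
# independent observations.
```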
“we find that 76% of each year’s average global temperature is determined by the previous year’s temperature, leaving little to be influenced by CO2.”
This makes no sense. Autocorrelation is about changes from reading to reading. What is important is the effect of CO2 on the trend, not on the fluctuations.
Precisely my first point.
My second point.
The rest of what you wrote is best described as a difference in opinion. Detrending doesn’t necessarily remove a bias, but is often used for that purpose. Interpreting R^2 as the amount of variability explained is not accurate, but again, it is the conventional interpretation. This is why this stuff is in the discussion section. We can look deeper into these interpretations of the statistics in further posts.
We can differ on opinions and interpretation, as long as we agree on the facts, and it seems we do.
As for more advanced methods of dealing with a/c, we’ll leave those for further posts. I was shocked that a/c was so poorly understood. It is very important in climate science, yet frequently ignored; climate scientists seem quite ignorant of this important area of statistics. The public needs to understand it, even if they don’t, so I thought I would start with the ABCs, ~2,000 words at a time, no more. Thanks for the links.
“The correlation does not reflect the impact of CO2 on the warming trend.”
It is just the wrong place to look. Correlation largely reflects the matching of fluctuations. It is not a good way to test trends, and in fact the series should have been detrended before correlation analysis, to make them stationary (as best possible).
How much of the Earth’s atmosphere is made up of CO2?
What is the amount of CO2 in the atmosphere you want to achieve?
Nick, agreed. There is a lot more I can do to examine this correlation and try to dissect it. I plan on writing more posts. The various CO2 to global temperature anomaly plots and correlations out there need to be explained properly, shown for what they are and what they are not.
As you say, a statistical correlation is definitely the wrong place to look. Get back to physics and ignore these cartoon plots, just my opinion.
It is interesting to plot rates of CO2 monthly changes year over year compared to monthly temp anomalies. As Dr. Humlum has shown, the strong correlation points to temps driving CO2, not the other way around. For example:
Further confirmation comes from the ability to predict CO2 concentrations using the formula:
CO2(this month, this year) = a + b × Temp(this month, this year) + CO2(this month, last year)
a and b are constants for scaling to match observations. Thus:
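A minimal sketch of the recurrence above (the constants a and b here are arbitrary placeholders for illustration, not the fitted values from the cited discussion):

```python
def predict_co2(co2_last_year, temp_anomaly, a=0.2, b=1.5):
    """One step of the commenter's recurrence:
    CO2(month, year) = a + b * Temp(month, year) + CO2(month, year-1).
    a and b are placeholder constants, not fitted values."""
    return a + b * temp_anomaly + co2_last_year
```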
Full discussion is at:
Thank you, Nick.
I have long worried about the validity of correlations between transformed data. I’ve occasionally used them when there is a skewed distribution, or when the measurement is via an instrument that reads out in a log manner, like a pH meter. I never really liked doing that and do not like it now. Geoff S
“The point of a/c is that it makes trends less significant, and reduces correlation estimates.”
Well, at least you state the major point – let us all remember that, since it was the publication of material by Robert Rohde that did not acknowledge this significant matter that kicked off this posting by Andy May.
Climate science has not covered itself in glory when it comes to the correct use of statistics, quite the reverse. Let us all agree that it is important to get statistics right rather than rushing to headlines.
I don’t believe that autocorrelation is even an issue here and the idea that it’s even possible to measure “…76% of each year’s average global temperature…” to such a level of accuracy is totally and utterly absurd. The fundamental problem is that you’re looking at data sets that are always going to correlate and trying to find ways of torturing the data to provide some sort of ‘meaningful’ result is just ridiculous.
You only have to look at what’s been happening and said at COP to understand just how daft this has all become. People talk about “keeping the limit to 1.5 degrees C” as though this is all beautiful and deterministic when it’s the complete opposite.
One thing I can tell you with 100% certainty is that nobody actually has a clue. Ultimately, statistics is remarkably simple in that it can be boiled down to the probability that something is happening by chance or not. Fundamentally, that is what we’re talking about here; and the plain truth is that the data isn’t remotely accurate enough to provide an answer no matter how hard anyone tries to claim otherwise.
Errors are the elephant in the room for climate science — just ask why you NEVER see confidence intervals quoted for any climate-related predictions.
Hmmmmm . . . I do believe the IPCC Assessment Reports do make plentiful use of confidence intervals when stating their predictions:
“A level of confidence is expressed using five qualifiers “very low,” “low,” “medium,” “high,” and “very high.”— source: Paragraph 9, THE IPCC FIFTH ASSESSMENT REPORT (AR5), Draft Guidance Notes for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties, available at https://www.ipcc.ch/site/assets/uploads/2018/03/inf09_p32_draft_Guidance_notes_LA_Consistent_Treatment_of_Uncertainties.pdf
A relation between CO2 emissions and CO2 measured in Mauna Loa air was seemingly little affected by the estimated 5% global emission reduction in lockdown year 2020. Ideally, one should be able to correlate other factors with both estimated CO2 emissions and measured CO2 in air. Sadly, one cannot until the 2020 discrepancy is explained.
Another “correlation” needing deeper examination is this one. It is the excuse to adjust station temperature data by data from nearby stations up to several hundred km distant. One suspects a similar high autocorrelation.
Temperature data do not always show a correlation decrease with increasing lags. Here is but one example of Melbourne daily Tmin temperatures for a year, with lags up to 10 days.
[Table: lag (days) versus correlation]
I absolutely agree – if you are smearing data over several stations then each reading is no longer an independent piece of data. This fails the preconditions of many statistical methods right out of the gate.
Making up data that does not exist such as they often do for the arctic is another huge problem.
The heat island effect is yet another type of correlation contained in the data: as population grows over time, the cities grow denser and heat pollution in the data increases. This has nothing to do with CO2; it is independent but still correlated, since both just happen to be going up.
So between biased conjectures, error margins, bad use of statistics, and many other competing sources for heat it is simply not possible to build a well defined cause-effect between CO2 and temperature data.
Try using well maintained rural stations only and raw data and you might be able to sniff something out.
The US Climate Reference Network—a network of state-of-the-art stations located in pristine areas across the lower 48 states and Alaska and Hawaii—shows no warming for more than 10 years.
We can differ.
Temperature profiles are close to being a sine wave. For two stations we have one function as f(t) = sin(t) and the second function is g(t) = sin(t + φ) where “φ” is the phase difference between the two stations. “φ” is related to temp difference due to distance as well as other confounding variables.
When you calculate the correlation of the two functions f(t) and g(t), you wind up with the correlation coefficient being cos(φ). This gives about 90% of the values from 0-2π a correlation coefficient less than 80%. (These are *very* round numbers.)
If I remember my notes correctly, a difference of 50 miles or so is enough to drag down the correlation to less than 0.8. A distance of about 10 miles results in a time lag at sunrise of 2 min (round numbers) or about 10 minutes for 50 miles. Kind of hard to impute a temperature from Point A to Point B with this kind of difference.
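The cos(φ) result is easy to verify numerically. A small sketch (illustrative only):

```python
import numpy as np

def phase_correlation(phi, n=100_000):
    """Correlation of f(t)=sin(t) and g(t)=sin(t+phi), sampled over full periods."""
    t = np.linspace(0, 2 * np.pi, n, endpoint=False)
    return np.corrcoef(np.sin(t), np.sin(t + phi))[0, 1]

for phi in (0.0, np.pi / 6, np.pi / 3, np.pi / 2):
    print(f"phi = {phi:.3f} rad: r = {phase_correlation(phi):+.3f} "
          f"vs cos(phi) = {np.cos(phi):+.3f}")
```

Over full periods the computed correlation matches cos(φ) to machine precision, e.g. φ = π/3 gives r = 0.5.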
Temperature at any specific location is more highly correlated with season than with previous or future daily temps. Local weather and geography have too much impact for there to be much correlation between daily temps at two different locations.
Thank you. That is a useful set of comments. Someone younger and wiser than me really needs to dissect this BEST graph approach. It is used globally to get away with scientific murder. Geoff S
The Big Question is the nature of the correlation between temperature anomaly and the slope of the atmospheric CO2 curve. That slope is the rate of increase giving net CO2 emission from all sources, natural and human-caused.
It is well known by now that the year-to-year changes in that slope — net emission — are large compared to a steady contribution to net emission from our industrial culture. The Wood-for-Trees Web site documents this effect.
NOAA’s go-to carbon-cycle guy Pieter Tans admits that there appears to be a large, temperature-correlated, natural contribution to net emission. He claims, however, that the source of this is the decay of leaf litter in the tropical rainforests, that this reservoir is shallow, and hence that the correlation does not extend to decades. On this view, the multi-decadal upward trend in atmospheric CO2 is almost entirely attributable to human activity.
But what if the natural reservoir emitting CO2 in response to increased temperature is much larger, and the correlation extends over multiple decades? It is then plausible that some non-trivial fraction of the decadal trend in atmospheric CO2 is the result of a natural process and not the fault of humans.
Atmospheric CO2 and global temperature is Not the Correlation You are Looking For — the important quantity is net emission, which is the slope of the CO2 curve.
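One way to see the distinction being argued here is to correlate temperature against the first difference of the CO2 series (net emission, the slope) rather than the level. A sketch with synthetic series (none of these numbers are real observations; the coupling between the slope and "temperature" is built in by construction):

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1960, 2021)

# Purely synthetic illustration: a "temperature" series and a CO2 curve
# whose year-to-year slope responds to that temperature.
temp = (0.015 * (years - 1960)
        + 0.1 * np.sin(2 * np.pi * (years - 1960) / 4)
        + rng.normal(0, 0.05, years.size))
increments = 1.0 + 1.5 * (temp - temp.mean()) + rng.normal(0, 0.3, years.size)
co2 = 315 + np.cumsum(increments)  # ppm

net_emission = np.diff(co2)  # ppm/yr: the slope of the CO2 curve
r_slope = np.corrcoef(temp[1:], net_emission)[0, 1]
r_level = np.corrcoef(temp, co2)[0, 1]
print(f"corr(temp, CO2 level)    = {r_level:.2f}")
print(f"corr(temp, net emission) = {r_slope:.2f}")
```

Differencing removes the shared upward trend, so the slope correlation is the more informative quantity when both raw series drift upward together.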
‘… some non-trivial fraction of the decadal trend in atmospheric CO2 is the result of a natural process and not [attributable to] humans …’
This seems to be a well-reasoned analysis of your question: https://edberry.com/blog/climate/climate-physics/preprint3/
He thinks humans are responsible for about 15% of the CO2 increase.
Additionally Humlum at Climate4you shows the increase starting in the tropical oceans.
It would be interesting to see similar statistical comparisons with the NOAA temperatures that do not contain fabricated data.
It seems as if every time I see a new presentation of the temperature record after 2000 it has been revised to show more of a continuous rise. Not long ago the record showed a period from 2004 until 2014 where there was no rise in temperature, and a temperature in 2020 that was no higher than in 2004.
Erase the pause 😉
Exactly my thought. If using adjusted temperature values I can make it look like anything I wish
Nerd fight 🙂
No. Simply pedos molesting single-digit numbers.
When I see temperature graphs like these I reflect on Mark Steyn’s comment that one day soon we won’t know what the temperature was in 1950, let alone predict it for 2050.
Both natural and man-made physical phenomena cannot be ignored.
Some natural phenomena must have caused the rise from 1910 to 1940.
Some combination of natural phenomena plus aerosols must have caused the fall from 1940 to 1970.
Sorry, but you can’t play maths games independently of the physical science.
Were aerosols higher from 1940 to 1970 than they are today?
Me – I don’t know
Alarmists – yes and no, depending on whether it suits their theory of the day. It is my understanding that alarmists required aerosols to be high from the 40s to the 70s, but after the 70s pollution controls kicked in.
Waza and Mike.
I offer the following about global average temperature (GAT).
There is much evidence that GAT has been rising intermittently from the depths of the Little Ice Age for about 300 years.
The estimates of specific values of GAT start at ~1880 and show that most of the subsequent warming occurred before 1940. However, 80% of the greenhouse gas (GHG) emissions from human activities were after 1940. Indeed, the start of a cooling period coincided with the start of the major GHG emissions from human activities.
Advocates of man-made global warming excuse this problem by attributing
(a) almost all the temperature rise before 1940 to be an effect of the Sun,
(b) the cooling from 1940 to 1970 to be an effect of human emissions of aerosols,
(c) the warming after 1970 to be mostly an effect of human emissions of greenhouse gases,
(d) the period of no statistically discernible trend of rise or fall in temperature after ~2000 to be unimportant because it includes some of the warmest recorded temperatures.
Two points should be noted about these excuses.
Evidence is lacking for the convoluted story needed to excuse the disagreement between the emissions and the temperature history.
it is obvious that temperatures at the end of a warming period will be the highest of the period.
” it is obvious that temperatures at the end of a warming period will be the highest of the period. “
Surprisingly, you seem to have far less problems with the inverse situation (e.g. the UAH period 2016-now).
The year 2016 should not be used to discern temperature change any more than should the year 1998. Both were years in which El Niño events occurred.
Considering China’s very large number of coal-burning power plants, I expect aerosols to be going up, not down. I doubt that they have the pollution controls that first world laws require.
And what, pray tell, caused the Medieval Warm Period? Certainly no CO2.
The medieval what?
Tony Heller has made the point again and again, really ad nauseam, that the true correlation is between CO2 and the “corrections” applied to the raw temperature record, and the replacement of measured data at many stations with simulated data. His calculations are open source and easily verifiable. It’s all very well to show that the correlation between the HADCRUT and CO2 concentrations is artificially high, but the real issue is *what is the actual global temperature doing*? Because from where I see the actual measurements, it’s not clear that the global average temperature isn’t going *down* from the standpoint of the last three millennia. It’s time for someone of your stature to address Tony’s point and the entire temperature record over the last 12,000 years from a true “climatic” perspective, rather than staring at the last 30 years of decent satellite data (also manipulated) against a background of wildly manipulated proxy data.
Very interesting point – I'll have to check Heller’s argument out. If he is right, and caution demands one verify carefully before accepting, it’s completely damning.
Temps went down for 40 years as CO2 rose. How much checking do you want?
There is no global average temperature.
Right. But I’ve been told that right now it is ‘warmer’ than it was in the middle of the Little Ice Age. What we need is two numbers of the same unit that present the observation. How would you calculate them?
Why not use the old Koppen climate zone classification, which is based on vegetation? (Sorry, can’t find the umlaut on this keyboard).
Any change in climate would be evidenced by shifts in zonal boundaries such as treelines.
There is no agreed definition for global average temperature (GAT) and if there were an agreed definition then there would be no possibility of a calibration standard for it.
This enables each team that creates time series of GAT
(a) to use its own unique definition of GAT
(b) to often change its definition.
The facts of this are explained by this item
especially its Appendix B.
And effects of the changes can be seen with a glance at this
Using an anomaly base period is effectively a method of calibration. As long as anomalies are used there doesn’t need to be an exact, agreed definition.
GAT is mostly used as a measure of change over time. All the surface data sets are in good agreement in respect to the rates of change observed, even though they use different anomaly base periods and, in many cases, different station data.
Since the late 1970s these observations have been supported by satellite measurements of the lower troposphere. Even UAH, which is an outlier on the cool side, shows statistically significant warming over time and its error margins bring it well within the warming ranges reported by the surface data.
“All the surface data sets are in good agreement in respect to the rates of change observed, even though they use different anomaly base periods and, in many cases, different station data.”
And they’re all violating the concept of intensive properties. A temp reading at one location is an intensive property of that location. Averaging that with an intensive property from a different location is physically meaningless. Doesn’t matter whether it’s an “anomaly” or the actual temp, still physically meaningless.
It also ignores the fact that global warming isn’t global. Some places have warmed, some have cooled, some have remained relatively static in the thermometer record. Averaging gives a false impression that something uniformly global is occurring.
Use an anomaly based around the freezing point of water and try and compensate for the huge energy differences required to freeze and unfreeze water. With a huge portion of the Earth’s surface covered in water vapor, liquid water and frozen water, an ‘average’ temperature is a meaningless number.
The seasonal graph of Arctic temperatures shows large non-linearity in temperature even though the energy input varies smoothly over the year. Ideally, the temperature should change sinusoidally, instead it looks like it went through an overloaded electric guitar amplifier.
“Using an anomaly base period is effectively a method of calibration. As long as anomalies are used there doesn’t need to be an exact, agreed definition.”
Nope. The variance of the anomalies is exactly the same as the variance of the absolute temps. Subtracting a constant (a shift, not a scaling) from the data doesn’t change the variance at all. That means the rate of change of the anomalies is also the rate of change of the absolute data.
That means that anomalies are no more of a calibration standard than the absolute temps.
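The invariance of variance under subtracting a baseline is easy to check. A minimal sketch (the station values and baseline are made up for illustration):

```python
import numpy as np

temps = np.array([14.8, 15.1, 15.3, 14.9, 15.6, 15.2])  # absolute temps, °C (made up)
baseline = 15.0                                          # hypothetical base-period mean
anomalies = temps - baseline

# Subtracting a constant shifts the data but leaves the spread untouched,
# so the variance (and any trend) is identical for temps and anomalies.
print(np.var(temps))
print(np.var(anomalies))
print(np.allclose(np.var(temps), np.var(anomalies)))  # True
```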
At last. I have to notice your development in probability theory. A few weeks back you denied (or completely misunderstood) this otherwise true point. Unfortunately, this doesn’t contradict what TheFinalNail said. You yourself said:
That’s why we don’t need to muck around with kinda problematic concept like the GAT. We have a simple method to calculate something that keeps some interesting properties of the GAT intact. And this is what TheFinalNail referred to.
Other than the fact that anomalies hide the real variance in real temperatures, their use also serves to propagandize the change in temperature.
Several times I have had people pull out an anomaly graph on their phone and try to show me how steep the change is. That is why I keep a graph that goes from 0–20 to illustrate the change.
All of a sudden it doesn’t look so very bad when temps go from 15.000 to 15.012!
Do you realize that your brother/husband/evil-twin/humane-side/sock-puppet (pick the right one) Tom asserted (rightly) the opposite? In the sentence I cited? Talk to him. He could get it right. You can do it too, it’s not that hard.
By the way, the increase in anomalies (that is, the increase in temperatures) is not the variance of temperatures.
No I don’t have it wrong, and in fact I was the one who showed him the variance of a single station is the same whether real temp or anomaly. You’ll notice I used the plural – anomalies. Each single station anomaly has a distribution and variance and when the distributions are combined through averaging, variances should be added. They never are. The combined variances are hidden.
No, the variance of the average is the sum of the individual variances divided by n², and your persistence in being wrong on this trivial matter is really silly. https://en.wikipedia.org/wiki/Variance#Sum_of_uncorrelated_variables_(Bienaym%C3%A9_formula) Please note that the “uncorrelated” part (a bit weaker precondition than “independent”) refers to the errors of the measurements.
Read this again and tell me where you find average.
This is peak idiocy. The sentence you highlighted above IS ABOUT THE FCUKIN SUM, NOT ABOUT THE AVERAGE. Can’t you fcukin comprehend the difference at last? By the way, the variance of the sum (and this is something you “accept”) gives you a standard deviation (ie. its square root) that is sqrt(n)*stdev, so you can get a glimpse of how this is an improvement over the stdev of the original individual variables. n*x has an stdev that is sqrt(n)*stdev(x) (assuming the same distribution). SEE? But I write this sentence down with a fear that you will again misunderstand it in a gross way. As you usually do.
How do you get an average? What’s the numerator?
If you take a sample of a population and calculate the mean, what is the uncertainty of that mean? Is the uncertainty ZERO? Meaning that mean is 100% accurate?
What happens when you combine that with other sample means? Is the uncertainty of all of those sample means equal to zero?
If there’s a station somewhere that takes n samples (with a fixed sampling interval) during a day, the daily average is X_avg = sum(X_i)/n. We can safely assume the X_i measurements are independent, with the same distribution (and the same uncertainty) so we’ve reached your second question:
U(X_avg) = U(x)/sqrt(n); under special (but very broad) conditions, uncertainty behaves like the stdev of a random variable.
You have to be careful with these combinations. That’s why those guys do those stuff you never understand (like homogenization etc.). You can’t just average a set of averages. And yes, after this, the uncertainty of these measurements decrease similarly.
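The 1/sqrt(n) behaviour of the mean of independent readings can be checked empirically. A sketch (the per-reading uncertainty and the "true" temperature are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 0.5       # assumed standard uncertainty of a single reading, °C
true_temp = 12.0  # hypothetical true value

# Spread of the mean of n independent readings vs the sigma/sqrt(n) rule
for n in (4, 16, 64):
    means = rng.normal(true_temp, sigma, size=(20_000, n)).mean(axis=1)
    print(f"n={n:3d}: stdev of mean = {means.std():.4f}, "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.4f}")
```

The empirical spread of the sample means tracks sigma/sqrt(n); the reduction only holds if the reading errors really are uncorrelated, as noted above.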
Come on, get real, never understand homogenization? You think highly of yourself don’t you?
How experienced are you with measurements and their use? Have you ever measured a crankshaft with a micrometer and had to tell a customer it was too far out of round to be reground, and that the cost to replace it would be $1000? How about signing your name to the output of a microwave transmitter, saying the output was within FCC specs, especially when your boss wanted the maximum output to minimize rain fade? Or how about signing a forecast that justifies a 1 million dollar equipment addition, hoping you are correct?
That is what you and other CAGW warmists are doing only instead of your own money, you are obligating everyone to pay. You better hope you are correct or more than reputations will be on the line. The government will not go easy on you if your forecasts were done incorrectly or were slipshod. Karma is real.
“U(X_avg) = U(x)/sqrt(n), uncertainty under special (but very broad) conditions behave as the stdev of a random variable.”
Uncertainty is specified as an interval around a value : “Stated value +/- uncertainty”.
You haven’t calculated an uncertainty at all. You haven’t actually calculated anything. The uncertainty of an average is the sum of the uncertainties making up that average. When you divide by sqrt(n) you are dividing by a constant. Constants have no uncertainty.
ẟX_avg = Σ ẟx_i from 1 to n since ẟn = 0.
No. Please read at least a bit about this. At least try, okay? https://en.wikipedia.org/wiki/Measurement_uncertainty
And please check https://en.wikipedia.org/wiki/Propagation_of_uncertainty especially the end of the Linear combinations section.
Ranting and ranting…
You need to read more carefully and learn about the subject you are discussing.
Here is a section of the referenced wiki that is most applicable.
Isn’t it funny how this keeps cropping up with no understanding as to what it means or what it applies to.
Please stop googling and cherry picking without understanding the underlying issue of measurement uncertainty (not measurement errors)!
I have also attached a screenshot from the wiki and you might want to look again at what the summation of variance is for function f. It appears to be a summation doesn’t it?
Most applicable to cases where “where no particular probability distribution seems justified or where one cannot assume that the errors among individual measurements are completely independent”. The whole preceding paragraph (that you elegantly omitted) deals with it. Temperature measurements are not like that. Scraping the bottom of the barrel?
Khm, that’s what you’ve just done 🙂
Why don’t you screenshot your reference describing the “average” of variances.
You and your compatriots like to make assertions but never show any references.
Well here are some pertinent excerpts:
“equal variance” Hmmmm “does not converge” Hmmmm, “average correlation remains constant” Hmmmm.
Have you checked these restrictions against temperature data? I can assure you the variance amongst stations is not equal. This alone would destroy the use of this “rule or law”.
Stop trying to cherry pick things that are square pegs, they won’t fit into round holes.
You and your peeps haven’t even decided if you are dealing with samples or a population. If a database of 2500 stations with data covering one year is one big sample of 2500 data points, then all of this is moot. Find the single sample mean and the standard deviation of the single sample, apply the formula to find the population SD and you’re done.
If you want to declare 2500 samples, then you need to show that the variances of each of the 2500 “variables” have equal variance and that their average correlation has no effect on the variance of the mean.
This statement makes no sense.
I need to point out that a restriction on this is “So if all the variables have the same variance σ2 “. That means that each and every temp station would need the same variance.
The wiki page also goes on to say:
and ends with:
Var(X+Y) = Var(X)+Var(Y)
Funny how that works, right?
What a rant… 🙂
https://en.wikipedia.org/wiki/Variance#Sum_of_uncorrelated_variables_(Bienaym%C3%A9_formula) , second equation. I’ve already referenced this very section a bit further up. In other words, you’re bullshoting again. BTW I don’t understand the relevance of “compatriots” here. Do you think all the normal people are Hungarians?
Again, you don’t understand what you cite. This is not the first time. This is all about correlated variables. We were talking about uncorrelated variables. Measuring the temperature every half an hour results in measurements where the errors are uncorrelated (actually, they are independent). This is a completely reasonable assumption. And I explicitly put this down when I talked about averages. Before you start another round of bs.
I see, you successfully misunderstood something again. I didn’t claim the variances were equal. I only said the standard deviation would be stdev(x)/sqrt(n) if the variables were uncorrelated and of the same distribution. If they were different, no problem: the stdev would be different and not this simple, but still sqrt(sum(var(xi)))/n. This shows reduction regardless of the fact that the var(xi)-s are different. See? We can average variables with different variances.
Well, the fact that you don’t understand it doesn’t mean it doesn’t make sense 🙂 Yes, var(x+y)=var(x)+var(y). But stdev(x)=sqrt(var(x)). So stdev(x+y) = sqrt(var(x)+var(y)) which is strictly less than sqrt(var(x))+sqrt(var(y)) = stdev(x)+stdev(y) if none of the variances are 0. (sqrt(a^2+b^2) < a + b, just square the two sides. a,b>0) In other words, even the stdev of the sum is relatively less than the stdevs of the single variables. And we don’t even assume the variances are equal.
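A quick numerical check of the inequality (numbers chosen purely for illustration):

```python
import math

var_x, var_y = 4.0, 9.0  # stdevs 2 and 3 (illustrative)
stdev_sum = math.sqrt(var_x + var_y)  # stdev of the sum of two uncorrelated variables

print(stdev_sum)  # sqrt(13) ≈ 3.606
print(stdev_sum < math.sqrt(var_x) + math.sqrt(var_y))  # True: 3.606 < 2 + 3
```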
Did you not read the last post I made with a screenshot? IT WAS DISPLAYING THE PART OF THE WIKI ON UNCORRELATED VARIABLES.
In other words, for uncorrelated variables:
Var(X+Y) = Var X + Var Y
They are not averaged!
Var(X+Y) = Var X + Var Y
They are not averaged!
They aren’t. No one claimed they were averaged. You have an extreme talent for misunderstanding everything. Var(x+y) is of course the variance of the sum. var(sum(xi)/n) = sum(var(xi))/n² is the formula for the average. I just noted (with a fear that you would misunderstand it, as you actually did) that the standard deviation of the sum, a quantity that has the same dimensions as the variables, grows only with sqrt(n), so it is relatively lower than that of the individual variables.
Sorry, I did *NOT* say any such thing!
I said the anomalies have the same variance as the underlying data. The variance of the anomalies do *hide* the temperature variance by scaling the actual numbers down.
Data like 15.1, 15.2, 15.3, …..
may have the same variance as .1, .2 .3, …..
but on a graph the smaller numbers LOOK much larger. That’s why it is important to use the actual temperature data. If the variances are the same then why use the anomalies in the first place?
Sorry, but this is not an eyeballing contest. This is science, it doesn’t count that it “looks” larger, and anomalies are not used for that purpose, and this leads us to the next point:
Glad you’ve asked this. Well, this has been explained to you 101010101 times, but there may never be enough explanation… So anomalies make different locations and different time periods comparable. A time series of temperature measurements in the Sahara and another one in Ireland are hard to compare. In terms of anomalies, they become quite easily comparable and meaningful. Furthermore, anomalies hide the possible distorting effect of different measurement methods, since change is much, much less sensitive to the method. See? It is this simple. This anomaly-thingy is something so basic I can’t understand why you “skeptics” are whining about it.
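A minimal sketch of the anomaly calculation (the station values, trends, and base period are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1991, 2021)
# Hypothetical mean annual temps: a hot station and a cool one, both warming ~0.02 °C/yr
sahara = 28.0 + 0.02 * (years - 1991) + rng.normal(0, 0.2, years.size)
ireland = 9.5 + 0.02 * (years - 1991) + rng.normal(0, 0.2, years.size)

def anomaly(series, base_mask):
    """Series minus its mean over the chosen base period."""
    return series - series[base_mask].mean()

base = (years >= 1991) & (years <= 2000)  # illustrative 1991-2000 base period
a_sahara, a_ireland = anomaly(sahara, base), anomaly(ireland, base)

# Absolute temps differ by ~18 °C; anomalies sit on a common scale near zero
print(sahara.mean() - ireland.mean())      # large offset between stations
print(a_sahara.mean() - a_ireland.mean())  # small: common scale
```

The ~18 °C offset between stations disappears in the anomalies, while each station's change over time is preserved exactly.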
It is about communication of what is happening. When a layman and/or politician looks at a graph, do you really expect them to interpret it properly? I certainly do not. They see a graph rising in outlandish fashion and think EMERGENCY. It is done to propagandize them. Proper science is to display results appropriately, so that percent growth better shows what is happening.
Yep, that’s why we insist on it should be science that has the last word. You “skeptics” make it much harder.
You are getting lost in the math. You can’t see the forest for the trees.
If CO2 is causing the temp increase you should see the CO2 rise precede the temp rise, not rise together. That the rise is so closely link means there is something else causing both of them to rise together.
Climate Science of CO2 “isn’t even wrong.”
”That the rise is so closely link[ed]”
Not closely linked.
It might be worthwhile to determine why ENSO drives CO2 increases with a short several month lag. Using annual averages hides so much of what is going on. That is why averages of the globe over a years time makes correlations fuzzy.
Is it not possible that CO2 is increasing because of temperature, and not the other way round?
I think so. The reason is that the warming is causing a rise in vegetation, and it is the rise in broad-leaf vegetation levels that is probably causing the rise in the CO2 level. Mapping from OCO-2 appears to support this, but it seems to be ignored. A good example of this is the NASA paper entitled “Satellite Detects Human Contribution To Atmospheric CO2.” The map of the US in that paper clearly shows the coincidence of so-called human-caused CO2 with forests, not areas of intense human activity.
Statistic analysis of fake data can’t give anything but fake conclusions : GIGO
Andy, nice analysis but you do know that Hadcrut5 data is complete garbage.
Surface temperature from 1850 is a reconstruction based upon proxy data – no accuracy or precision – ice and mud cores, crustaceans, and tree rings (95%), plus surface temperature measurements (5%) with no coverage of the oceans (71% of the surface), forests, deserts, mountains, lakes, or poles. And we know that all of this spurious data has been manipulated, frequently updated, and smoothed to cool the past and warm the future. We know temp has risen and we know CO2 has risen, so of course there is correlation, but coincidence and correlation are not evidence of causation.
The belief that we can trust any of the information used to reconstruct the past is spurious nonsense; we cannot. E.ON in the UK are running an advert bellyaching about one glacier in Austria as evidence of climate crisis, when glaciers have been melting since the end of the last glaciation. Northern Powergrid, owned by Warren Buffett, are asking their customers to take part in a survey which is no more than a thinly veiled attempt to exploit all of the propaganda touted by the climate deranged: to impose the belief that, in order to tackle the climate crisis, we must accept that brownouts and blackouts will become the norm, because of the unreliability of renewables, because otherwise we are all going to die.
“Temperatures are rising now faster than at any time in the last 66 million years.” Conveniently ignoring that 66 million years ago 76% of all life on earth was extinguished by the 9-mile-wide asteroid which hit the Yucatan peninsula with the explosive force of 10 billion Hiroshima nuclear bombs (BBC, The Day the Dinosaurs Died). I remain convinced that most, if not all, of those involved and paid to be involved in the myth of human-induced warming are fully aware they are being deliberately deceptive by cherry-picking convenient moments in time to authenticate their ideology. The question is why? Why are they so determined to resolve a problem that does not exist by rewriting history, and why do they believe that their dreamland utopia, or current civilisation, can survive without fossil fuels?
A lot are true believers, some are too uneducated to understand the implications of the facts expressed here, and some are too heavily invested in their position and can’t afford to lose the funding and fame. Some understand the weakness of the actual data record but are using it to drive the political changes they hope to enact.
Loren & DavidW:
Yes. Somewhere I read of “Dupes” & “Knaves”. Dupes of course are the uneducated believers who don’t research the issue and just follow the crowd. The knaves know better but they benefit [fame, power & money] so the process continues.
You see a similar process in “woke-ism”, which btw may have an even worse effect on society than the “climate change” hysteria.
This video by Tony Heller explains why Andy is wasting his time analyzing the properties of a manufactured temperature time-series of the earth.
The Climate Of Gavin – YouTube
Zoe Phin has addressed the data in this post What Global Warming? – Zoe’s Insights (phzoe.com)
The fact that climate models use adjusted data for calibration is why they are going off the rails. The divergence is only going to get worse. Mark my word, there will be a concerted effort to dump the UAH data set. My guess is that the Climate Reference Network of high-quality US stations will also be dumped in time.
As I recall, Doug Keenan some time ago had a lengthy exploration of something similar, using ‘driftless ARIMA’ statistics: https://wattsupwiththat.com/2013/05/31/the-met-office-responds-to-doug-keenans-statistical-significance-issue/
and also discussed here https://quantpalaeo.wordpress.com/2013/05/28/testing-doug-keenans-methods/
To my simple, very fusty statistical toolbox, the trick is guessing correctly the distribution that one’s sample comes from, then using a test to compare samples to determine how likely they are to be from the same or different distributions.
Here, where both series are autocorrelated, Keenan’s time series comments might be germane (or not!).
Thanks for this article.
Here is Doug’s original post on BishopHill – quite an entertaining read: http://bishophill.squarespace.com/blog/2013/5/27/met-office-admits-claims-of-significant-temperature-rise-unt.html
Thanks for that article. I may refer to it in the future.
Here’s an excerpt:
“Plainly, then, the Met Office should now publicly withdraw the claim. That is, the Met Office should admit that the warming shown by the global-temperature record since 1880 (or indeed 1850) might be reasonably attributed to natural random variation.”
Now if we can just get NASA Climate and NOAA to admit this. Yeah, i know, I won’t hold my breath waiting.
Trying to generate sense out of nonsense is the pursuit of madness. F1 generates the same inane stupidity: Lewis Hamilton’s wing after qualifying was 0.2 mm out of spec at one end, with two-thirds of the wing within specification, which the stewards determined justified disqualification. This continuing squabbling about minutiae is itself the product of estimated numbers pathologically manipulated by statistical inanity to impose a belief which, even if it could be advanced to a theory, can only be resolved – according to Christiana Figueres – if emissions of CO2 are reduced by 50% by 2030. To claim to resolve a crisis which is stated to exist now, by exploiting ignorance to impose yet another belief – that weather now is worse than at any time in recorded history, which the IPCC say cannot be supported by evidence – is the product of insanity.
Made worse by the fact that both sides of the debate recognise that every assertion made is based upon conjecture, assumption, and supposition, not hard data; because when you consult what hard data does exist, the product of that data is minuscule, maybe 1 °C, yet to resolve this insignificant number civilisation needs to die by Tuesday. Comically, those who make these invalid assumptions don’t appear to understand that if their beliefs become reality, their lives are extinguished along with those of the ordinary mortals their ideology determines need to be constrained, lest the planet go to hell in a handcart. This is fantastical, obsessive paranoia about diddly squat, based upon statistical irrelevance, arriving at a meaningless conclusion whose solution does not exist, because not one single person or organisation can prove, beyond reasonable doubt or on the balance of probabilities, that CO2 causes climate change or any fraction of it. And when CO2 continued to rise even though emissions from fossil fuels fell by 5.4% because of Covid, we don’t even know, after decades of fantasising, whether emissions from fossil fuels make any measurable difference.
There are too many people on the planet who have too much time on their hands and are willing to spend pointless hours in debate about diddly squat. The planet has warmed and cooled since forever, and it will continue to do exactly that; any idea that humanity can influence this reality is barking mad.
My problem with all of this is that it ignores the theory that CO2 is well mixed. If CO2 is the control knob (and even this analysis could lead you to that conclusion), then temperatures everywhere should show a rising curve. On the other hand, a Global Average Temperature, if it is to mean anything, means half the globe is below this figure and half above.
I keep asking what the standard deviation of this average is, and no one ever has one. The conclusion I reach is that this is not a true statistical analysis from the get-go. It is mental masturbation trying to find a useful answer. One cannot discern from the current analyses exactly what is happening where. That is not science.
I would be much more interested in seeing the half of the globe where temps are rising in such a fashion. My interest would lie in the 70% of the globe that is covered with water. Are SSTs rising so much that they are affecting the whole globe, or are the rising temps in some other concentrated area(s)? Is the near-surface atmosphere over the oceans warming? This is somewhat like saying the world’s supply of gold is 0.# grams per square meter. It ignores the fact that there are concentrated areas where the amount is #.# x 10^3 grams per square meter and 0 everywhere else.
Serious time series analysis is missing from most analyses. A good twitter friend has been working on this, and his finding is that CO2 concentration follows ENSO with a time lag of several months. All of this has seasonal and monthly changes, and annual averages simply ignore them. Much more granularity is needed to discern what is happening.
Not only are the standard deviations unknown, the temperature data series are invariably treated as if they have zero uncertainty. It seems like taking a reasonable amount of u(T) into account should affect the statistics.
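As a rough illustration of that point, here is a minimal Python sketch of propagating a measurement uncertainty u(T) through a trend fit. Everything in it is assumed for illustration: the 0.01 C/yr trend, the 0.1 C year-to-year noise, and the ±0.2 C value of u(T) are invented numbers, not properties of HadCRUT5.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

random.seed(42)

# Hypothetical anomaly series: 0.01 C/yr trend plus 0.1 C noise.
years = list(range(1850, 2021))
true_anoms = [0.01 * (y - 1850) + random.gauss(0, 0.1) for y in years]

# Propagate an assumed +/-0.2 C measurement uncertainty u(T) by
# re-drawing the series many times and re-fitting the trend each time.
u_T = 0.2
slopes = []
for _ in range(500):
    perturbed = [t + random.gauss(0, u_T) for t in true_anoms]
    slopes.append(ols_slope(years, perturbed))

# The spread of the fitted slopes is the trend uncertainty due to u(T) alone.
spread = statistics.stdev(slopes)
print(f"trend spread from u(T): {spread:.5f} C/yr")
```

With independent errors the effect on the fitted slope is modest; correlated errors (systematic adjustments, station moves) would inflate it considerably, and neither component appears in the usual quoted trend figures.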
“There are lies, damned lies and statistics.” – Mark Twain
Correlation is not causation. Temperature change can cause CO2 change, as history over millennia has shown. CO2 is not a cause of temperature change; it is a RESULT of temperature change. Throwing trillions of dollars at CO2 will not only fail to reduce temperature, it will also fail to reduce CO2. This whole CO2 thing is the greatest fraud ever foisted on the public.
CO2 is the new ‘witch’ that must be burned at the stake.
Why can’t it be both? No one disputes that, over centuries, natural cycles can melt sea ice and warm the oceans, releasing CO2 which then acts as positive feedback for further warming. In this case, it’s fair to say that natural warming resulted in increased CO2; but it doesn’t follow that CO2 isn’t an efficient greenhouse gas on its own.
Various simple scientific experiments (example) confirm that CO2 has the ability to reduce the rate of cooling of the atmosphere. So, in the case where CO2 is introduced first, before natural warming releases it from its various sinks, then it seems reasonable to expect it to be the instigator of the warming.
No, but it’s often evidence supporting possible causation, especially when one of the variables has known causative effect.
As I have speculated on here, blackouts and energy poverty are directly correlated to how often Griff posts comment about the accelerating uptake of renewable energy installations.
Of course, what you are omitting is that not only do natural cycles cause warming, but they also cause COOLING !
And irrespective of how warm the climate was (think age of dinosaurs), the climate eventually became cooler (if not an ice age.).
This demonstrates beyond any doubt that there are other factors that determine the climate that have nothing at all to do with CO2.
Basically, CO2 goes along for the ride; it gets picked up when the climate warms and gets dumped off the side of the road when the climate cools.
A lab experiment – like the one you referred to – does not and cannot model ALL the variables that can affect climate; but of course, you know that.
The notion that a trace gas in the atmosphere is the controlling mechanism of the climate is so dishonest that it beggars belief.
Any idiot can see that if human activity generated, say, nitrogen, into the atmosphere, the climate zealots (i.e., leftist activists) would be targeting all industry that introduced nitrogen into the atmosphere.
“Any idiot can see that if human activity generated, say, nitrogen, into the atmosphere, the climate zealots (i.e., leftist activists) would be targeting all industry that introduced nitrogen into the atmosphere.”
N2 is inert.
Good job as it comprises 78% of our atmosphere.
It isn’t a GHG.
That reduces LWIR emission to space.
The natural CC has/will always act irrespective of humans.
And yes temperature drives CO2 in the natural CC with its up/down a feedback to warming/cooling, as it is a GHG.
One that doesn’t precipitate out.
And if injected into the atmosphere beyond the Earth’s natural sinks ability to absorb that pulse.
It causes warming.
It just does.
On that the science is settled.
Climate zealots or no.
Point flying past, far over AB’s head.
I will assume you are just being facetious.
But in case you are actually serious, my point was that the entire climate change BS is just that, BS.
It is a LEFTIST POLITICAL movement designed to eviscerate the economies of the advanced Western democracies, in particular that of the USA.
My comment about N2 – which obviously went way over your head – is that if the major industrial gas produced by industry were N2, the climate zealots (leftists, socialists, communists) would claim that N2 was a pollutant.
I would not be surprised if the entire climate change movement is funded by Russia and China; two nations that do not themselves buy into the climate change bullshit.
Yes, China in particular may talk the talk to appease leftist Western “intellectuals” (after all, the Chinese economy depends a great deal upon exports to W. Europe and N. America), but they are building HUNDREDS of coal-fired electric generating plants (as is India).
If the climate zealots were really serious, they would immediately cease the use of natural gas, electricity generated by any fossil fuels (including natural gas powered generating stations), autos, buses, trains, planes etc., and cease wearing and buying anything made from synthetic materials.
I am not holding my breath.
The argument that water vapor precipitates out, and that this somehow makes CO2 worse, is ludicrous. Do you think water vapor is not replenished as it precipitates? If it weren’t, the globe’s humidity would be pretty low. It also ignores the fact that CO2 has a “precipitation” of its own, if you will; it is just called something different, i.e., absorption.
Yes, you’re right. One of the variables has known causative effect. It’s known that rising temperature reduces the solubility of CO2 in water, and that 71% of the earth surface area is water, and that most of the world’s CO2 is in the oceans.
As for CO2 causing warming, that’s a far more speculative idea, given that the effect, if any, is small and easily offset by many other effects, especially cloud cover and emergent phenomena.
“Why can’t it be both? No one disputes that, over centuries, natural cycles can melt sea ice and warm the oceans, releasing CO2 which then acts as positive feedback for further warming.”
That sounds like a recipe for runaway global warming. But we have never had runaway global warming in the past, although CO2 levels have been much higher in the past than they are now.
CO2 mapping suggests that the highest CO2 levels occur where broadleaf vegetation is most prevalent.
” statistical rigor is sorely lacking in the climate community.”
How about rigor altogether.
Rigor mortis maybe
So, let’s assume for the moment that the temperature changes FIRST , and then subsequent to temperature changes, CO2 levels will then change in response to these temperature changes.
How would the auto-correlations change, if at all?
How do the auto-correlations account, if at all, for cause and effect?
The start date of the analysis is about the year 1850; is this not the approximate time the Little Ice Age (LIA ) came to an end?
Why wouldn’t the climate be warming coming out of the LIA?
If this analysis was conducted over the time period 1150 to 1250 – about the last 100 years of the Medieval Warm Period (MWP) – what would the analysis “predict” about the climate subsequent to 1250?
Would the analysis – or any analysis – have predicted the onset of the LIA which commenced in about 1300 or so?
If CO2 determines the climate – as we have been told ad-infinitum – where did all that CO2 come from that “caused” previous (pre-human-influenced) warm climate regimes, and where/how/why did all that CO2 disappear to, that “caused” previous (pre-human-influenced) warm periods to end and which were followed by cold climate regimes?
Are there any time periods in the past – before humans had any influence on the climate – where the auto-correlations shown in the article would have produced almost identical results?
Good job Andy.
Might I suggest that you use this method to see if CO2 is correlated to temperature?
My rather simple try using 1st derivatives suggests that CO2 follows temperature, with about a 7-month lag.
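For anyone who wants to try this kind of lag search, here is a hedged Python sketch on synthetic monthly data. The 7-month lag is deliberately built into the fake CO2 series, so this only demonstrates that the first-derivative method can recover a known lag; it says nothing about the real UAH or HadCRUT series.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def best_lag(temp, co2, max_lag=12):
    """Lag (in months) at which diff(co2) best tracks diff(temp)."""
    dT = [b - a for a, b in zip(temp, temp[1:])]
    dC = [b - a for a, b in zip(co2, co2[1:])]
    scores = {lag: corr(dT[:len(dT) - lag], dC[lag:])
              for lag in range(max_lag + 1)}
    return max(scores, key=scores.get)

# Synthetic series: CO2 changes echo temperature changes 7 months later.
random.seed(1)
e = [random.gauss(0, 1) for _ in range(241)]
temp = [0.0]
for i in range(1, 241):
    temp.append(temp[-1] + e[i])          # random-walk "temperature"
co2 = [0.0]
for i in range(1, 241):
    step = 0.7 * e[i - 7] if i >= 8 else 0.0
    co2.append(co2[-1] + step + random.gauss(0, 0.1))

print(best_lag(temp, co2))  # recovers the built-in 7-month lag
```

Run against real monthly anomalies and CO2 concentrations, the same `best_lag` scan is the sort of calculation that would produce the 7-month result described above.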
Again, good job.
CO2 inflow rates respond to temperature without a lag. Observing this relationship is the place to start.
My chart shows the best correlation at 7 months lag.
That’s fair and I won’t dispute this. In the context of this article analyzing the residuals of this relationship is where statistical insights can be gained. For example, it makes sense to check for autocorrelations, trends, and other patterns in the residuals of this relationship. Bear in mind, net natural CO2 flux timing most likely relates to surface conditions not mid troposphere.
Agreed. That is why I did it with both UAH and HADCRUT. But UAH performed better. Much to my surprise.
But a 7 month lag gave best results with HADCRUT as well.
ok thanks. The increasing departure with time BP is interesting in HadCRUT. This could be a real effect. Or it could represent an issue with the data if the residuals between UAH and HadCRUT vs CO2 do not follow the same error distribution.
Thank you, Andy, for your excellent work showing CO2 probably has some very small impact on global temperatures as estimated by HadCRUT5.
Andy, here is a suggestion. Contact Jim Hamilton at UCSD. Jim is an expert in time series analysis. He is also a nice guy. I haven’t had contact with him since my PhD days at Virginia. His Time Series book is a classic. So much data analysis in climate lacks rigor. I would really like to see what Jim would say about the monthly data from UAH, HadCRUT5, and CO2 from 79 forward.
I noticed decades ago that the early Global Warmmongers seemed to know no statistics. Alas, the people who joined the field later imported not higher statistical competence but lower standards of morality. In other words crooks were replacing duds.
The adjustments made to HadCrut and GISS have a shockingly high correlation to CO2. Would be fun to have someone put that through a rigorous statistical examination. No reason for those variables to correlate, but they do. Almost perfectly. That seems a forensic smoking gun of malfeasance.
Sorry Mary, I wrote my comment above before noticing your comment. I agree totally. David
I wouldn’t trust the source data sets. NOAA, NASA, HADCRUD, Berkeley Earth, etc. are all coordinated and aligned. They all include measurement sites whose signal has been contaminated by progressive urbanisation warming. All the datasets have also been processed by algorithms and statistical techniques that ensure a strong correlation. A number of people have looked at raw NOAA measurements and excluded the sites with urbanisation-related warming; analysis of the raw temperature readings also shows a much noisier signal, with periods in the past much warmer than the graphs provided in this post (based on ‘homogenised’ and ‘standardised’ NOAA data) would suggest. You will find that these agencies have taken the raw dataset, curve-fitted it to the CO2 trend, and applied Gaussian denoising functions too… so your analysis of R2 on data that has already been adjusted to give an artificial fit doesn’t tell us much about the underlying climate system, or about the underlying Earth temperature record and its relationship to atmospheric CO2 concentrations.
See work by Tony Heller
Tony Heller also goes through the adjustments to the historical temperature records in quite some depth on his YouTube channel, and verifies the historical climate and weather of bygone eras against the newspaper records of events from many decades ago. He shows that the 1920s and 1930s were as warm as, if not warmer than, today, with more extreme weather (not indicated by these deceptive graphs from government agencies).
I also recommend you look at papers by Dr Willie Soon et al., who have also removed ‘dirty’ data from the historical surface dataset and then looked for correlations with solar activity and other variables and external forcings.
“See work by Tony Heller”
And then if you are at all sceptical of sceptics (sarc)
Go here; it explains why the man is, err, misguided (including comments to that effect from our host Mr Watts, Bob Tisdale, Steven Mosher, and Nick Stokes).
In short, anomalies and spatial interpolation are required. You can’t just average the absolute mean temps of all stations available x years ago, and progressively forward to today, and call that the trend.
You can’t, and it’s not.
Try being sceptical of sceptics …. especially ones that appeal to your conspiracy ideation.
There isn’t one.
Eg: From Nick Stokes …
“There is a very simple way to show that Goddard’s approach can produce bogus outcomes.”
“SG’s fundamental error is that he takes an average of a set of final readings, and subtracts the average of a different set of raw readings, and says that the difference is adjustment. But it likely isn’t. The final readings may have included more warm stations, or more readings in warmer months. And since seasonal and latitude differences are much greater than adjustments, they are likely to dominate. The result just reflects that the station/months are warmer (or cooler).
That was the reason for his 2014 spike. He subtracted an average raw of what were mostly wintrier readings than final. And so there is an easy way to demonstrate it, Just repeat the same arithmetic using not actual monthly readings, but raw station longterm averages for that month. You find you get much the same result, though there is no information about adjustments, or even weather.
I did that here. What SG adds, in effect, is a multiple of the difference of average of finals with interpolation from average of finals without, and says that shows the effect of adjustment. But you get the same if you do the same arithmetic with longterm averages. All it tells you is whether the interpolated station/months were for warmer seasons/locations.
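The effect Stokes describes is easy to reproduce with a toy example. This Python sketch uses two invented stations with constant temperatures (no trend at all), one of which stops reporting halfway through; the changing station mix manufactures a jump in the naive absolute average, which the anomaly method removes.

```python
import statistics

# Two hypothetical stations with different climatologies but ZERO trend:
# a warm station at a constant 25 C and a cold one at a constant 5 C.
years = list(range(2000, 2020))
warm = {y: 25.0 for y in years}
cold = {y: 5.0 for y in years}

def reporting(y):
    """Absolute temps from whichever stations report in year y.
    The cold station drops out after 2009, changing the mix."""
    temps = [warm[y]]
    if y < 2010:
        temps.append(cold[y])
    return temps

# Naive absolute average: jumps from 15 C to 25 C with no real warming.
naive = [statistics.mean(reporting(y)) for y in years]

# Anomaly method: subtract each station's own long-term mean first.
warm_base = statistics.mean(warm.values())
cold_base = statistics.mean(cold.values())

def anomaly(y):
    out = [warm[y] - warm_base]
    if y < 2010:
        out.append(cold[y] - cold_base)
    return out

anom = [statistics.mean(anomaly(y)) for y in years]

print(naive[0], naive[-1])  # 15.0 25.0 -- spurious 10 C "trend"
print(anom[0], anom[-1])    # 0.0 0.0  -- correctly flat
```

The same arithmetic applied to raw versus final station sets is why differencing two averages over different station mixes says more about the mix than about any adjustments.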
The increase in vegetation appears to be the cause of the increase in CO2 level. Why does this correlation receive no attention?
The problem with anomalies is that they are being misused.
The use of significant digits is totally ignored. The precision of the anomalies far exceeds the precision of the measurements. It is unscientific to portray measurements to higher precision than actually measured. The way they are being portrayed borders on fraud.
The variance of anomalies is not portrayed properly when graphing them against each other. It makes the growth factors look two orders of magnitude greater than temperature changes actually are. For example:
Anomalies: 0.1 -> 0.3 gives [(0.3 – 0.1) / 0.1] • 100 = 200%.
Absolute temperatures: 15.1 -> 15.3 gives [(15.3 – 15.1) / 15.1] • 100 = 1.3%.
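The two calculations above can be checked directly with a short Python snippet using the same numbers:

```python
# Percent change of the SAME 0.2 C warming, computed on anomalies
# versus on absolute temperatures (numbers from the comment above).
anom_before, anom_after = 0.1, 0.3
abs_before, abs_after = 15.1, 15.3

pct_anom = (anom_after - anom_before) / anom_before * 100
pct_abs = (abs_after - abs_before) / abs_before * 100

print(round(pct_anom))    # 200 -- looks like a 200% explosion
print(round(pct_abs, 1))  # 1.3 -- same warming, unremarkable
```

The arithmetic is identical in both cases; only the baseline changes, which is why plotting growth factors of anomalies exaggerates the visual impression of the same physical temperature change.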
All of the variations you describe, such as seasons, latitudes, altitudes, and so on, are factors that should be included in a global temperature. This is one reason you cannot relate anomalies to any given temperature, let alone any regional temperature.
To my way of thinking, you can take an average global real temperature and create anomalies annually if you so wish. This would allow just as accurate a determination of warming or cooling and it would let you easily relate it to a calculated temperature, although regional temps would still be unknowable.
Regional and local temps are the real issue anyway. A global temp fosters the belief that everywhere is warming the same. We know that is not the case. But how many studies have we seen with the same old statement that global warming is affecting a local condition? You would think scientists would be smart enough to know that an AVERAGE automatically implies some above and some below. That should prompt them to avoid the generalization that their specific region is warming at least as much as the average.
On another thread someone said the variance associated with the GAT is 1.05 °C². That makes the standard deviation about ±1.02 °C. Who believes in an anomaly of 0.15 ± 1.02 °C? If I were a machinist trying to peddle this, I would be laughed out of the room!
I am most definitely in the extreme sceptics camp and do believe that data and findings are politically driven.
I am not convinced that the temperature anomalies in the surface temperature record can be trusted, nor that they really tell us anything about local, regional, or global temperature trends. I agree with arguments that some sites are prone to problems from urban heat island effects, time-of-measurement changes, instrument changes, and station relocations (similar to changing and urbanising the environment around the station); all of these affect the values recorded at a given site. We see that, rather than excluding sites with bad or dubious data (i.e., sites that have urbanised, moved, or had material changes in instrumentation or recording method), these are used in weighted averages that drown out the few stations whose long-term contiguous records can be trusted. Furthermore, when calculating regional averages, this homogenisation over broad areas creates further issues as northerly sites are decommissioned over time and biases towards southerly latitudes appear; meanwhile the majority of the recording grid goes through significant change in land use and urbanisation in the few square km around each site, or, worse, the site itself becomes completely compromised.
Living in South East England, UK, I can say that I have noticed a general decrease in periods of extreme cold (fewer blocking anticyclones dragging cold winter air from the continent) in the past two or three decades. But the 30-year standard for climate averages is far too short given the multidecadal nature of many natural phenomena governing decades-long weather/climate. Further, the broader increase in temperature in the Central England trace appears to be just a rebound from the little ice age. As for the slight reduction in overnight lows and extreme cold, that is surely to be celebrated… and brings fewer cold deaths…
If you take the trouble to look for some good sites in NOAA’s raw history of surface temperature recordings, you will see for yourself that the warming is exaggerated, and that what warming there is could quite easily be natural in origin and is certainly not harmful.
Whilst Steven Goddard, or Tony Heller, aims his YouTube videos and blogs at a non-technical audience, I feel his work answers the broader questions: is warming happening, and is it harmful, unusual, unnatural, or unprecedented?
Warming — yes, but very slight, and in line with cyclical patterns going back through instrumental history as well as beyond, into human recorded history (news about conditions) and paleoclimate history.
Unusual or unprecedented — no; extreme weather events had much more catastrophic impacts on humans and nature in the past; the climate was warmer at many points in human history and much warmer in the periods when plants and animals evolved. The climate also warmed faster during previous warming events.
Write to your politicians as I do to tell them that this whole Green New Deal/Great Reset/Net Zero nonsense to ban fossil fuels is all about rent-seeking scoundrels and those that want to reshape the economic fortunes and geo-politics of the planet (to the detriment of most people, plants and animals that inhabit it).
Going back to Tony Heller data plots:
After all for the last couple of decades we have the U.S. Climate Reference Network (USCRN) – Quality Controlled Datasets which show little warming too
And if we are going to look at global variations in temperature we should perhaps look to satellite data, which also shows little warming of the troposphere:
I studied and worked as an atmospheric researcher and meteorologist in the 1990s; since the late 1990s I have worked on power-systems engineering, energy policy, and market rules and regulation, and so I have a rather sceptical view of what the mainstream claim to be happening, what the politicians claim needs doing, and what the rent-seekers are doing to our energy systems and economy in order to get rich themselves.
Is not HadCRUT5 a biased series to start with (urban heat island effect, etc.)? Maybe use organic food consumption per capita as the dependent variable. Why use just CO2? One can easily do a multivariate regression and include things like variations in sunlight hitting the earth and cloud cover, and look at CO2’s significance rather than R2; there is too much spurious correlation from left-out variables.