A Linear Digression

Guest Post by Willis Eschenbach [SEE UPDATE AT END]

In my most recent post, called “Where Is The Top Of The Atmosphere”, I used what is called “Ordinary Least Squares” (OLS) linear regression. This is the standard kind of linear regression that gives you the trend of a variable. For example, here’s the OLS linear regression trend of the CERES surface temperature from March 2000 to February 2021.

Figure 1. OLS regression, temperature (vertical or “Y” axis) versus time (horizontal or “X” axis). Red circles mark the ends of the correct regression trend line.
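For readers who want to play with this at home, here is a minimal Python sketch of the same kind of OLS trend fit. It uses a synthetic monthly series as a stand-in for the CERES data, so the numbers are purely illustrative:

```python
import numpy as np

# Synthetic stand-in for a monthly global-mean temperature series,
# March 2000 to February 2021 (252 months). The real CERES data are not used here.
rng = np.random.default_rng(0)
t = 2000 + 2.0 / 12 + np.arange(252) / 12.0                     # decimal years
temp = 0.02 * (t - 2000) + 0.1 * rng.standard_normal(t.size)    # ~0.02 C/yr trend plus noise

# OLS fit: the first coefficient is the trend in degrees C per year
slope, intercept = np.polyfit(t, temp, 1)
print(f"OLS trend: {slope:.4f} C/yr ({10 * slope:.2f} C/decade)")
```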

However, there’s an important caveat about OLS linear regression that I was unaware of. Thanks to a statistics-savvy commenter on my last post, I found out that there is something that must always be considered regarding the use of OLS linear regression.

It only gives the correct answer when there is no error in the data shown on the X-axis.

Now, if you’re looking at some variable on the Y-axis versus time on the X-axis, this isn’t a problem. Although there is usually some uncertainty in the values of a variable such as the global average temperature shown in Figure 1, in general we know the time of the observations quite accurately.

But suppose, using the exact same data, we put time on the Y-axis and the temperature on the X-axis, and use OLS regression to get the trend. Here’s that result.

Figure 2. OLS regression, time (vertical or “Y” axis) versus temperature (horizontal or “X” axis). As in Figure 1, red circles mark the ends of the correct regression trend line.

YIKES! That is way, way wrong. It greatly underestimates the true trend.
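A quick way to see why this happens: the OLS slope of Y-on-X times the OLS slope of X-on-Y equals r², the squared correlation. So unless the points lie exactly on a line, the flipped regression is not simply the inverse of the original fit, and the fitted slope gets pulled toward zero by the scatter. Here is a small illustration with made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(252) / 12.0                        # years since start
temp = 0.02 * t + 0.1 * rng.standard_normal(t.size)

b_temp_on_time = np.polyfit(t, temp, 1)[0]       # degrees C per year
b_time_on_temp = np.polyfit(temp, t, 1)[0]       # years per degree C (attenuated by the scatter)
r = np.corrcoef(t, temp)[0, 1]

print(f"temperature-on-time slope     : {b_temp_on_time:.4f} C/yr")
print(f"1 / (time-on-temperature)     : {1 / b_time_on_temp:.4f} C/yr")
print(f"product of the two OLS slopes : {b_temp_on_time * b_time_on_temp:.3f}")
print(f"r squared                     : {r ** 2:.3f}")   # same as the product above
```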

Fortunately, there is a solution. It’s called “Deming regression”, and it requires that you know the errors in both the X and Y-axis variables. Here’s Figure 2, with the Deming regression trend line shown in red.

Figure 3. OLS and Deming regression, time (vertical or “Y” axis) versus temperature (horizontal or “X” axis). As in Figure 1, red circles mark the ends of the correct regression trend line.

As you can see, the Deming regression gives the correct answer.
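For anyone curious how Deming regression is actually computed, here is a minimal sketch of the standard closed-form estimator. It assumes you know (or assume) the ratio delta of the Y-error variance to the X-error variance; the data below are made up, not the CERES series used in the figures:

```python
import numpy as np

def deming_fit(x, y, delta=1.0):
    """Deming regression; delta = var(y errors) / var(x errors)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.mean((x - xbar) ** 2)
    syy = np.mean((y - ybar) ** 2)
    sxy = np.mean((x - xbar) * (y - ybar))
    # Standard closed-form Deming slope (see the Wikipedia "Deming regression" entry)
    slope = (syy - delta * sxx +
             np.sqrt((syy - delta * sxx) ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return slope, ybar - slope * xbar

# Toy example: a true 1:1 line with comparable noise on both axes
rng = np.random.default_rng(2)
truth = np.linspace(0.0, 10.0, 200)
x = truth + rng.normal(0.0, 1.5, truth.size)
y = truth + rng.normal(0.0, 1.5, truth.size)

print("OLS slope    :", np.polyfit(x, y, 1)[0])            # biased low by the x errors
print("Deming slope :", deming_fit(x, y, delta=1.0)[0])    # close to the true value of 1
```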

And this can be very important. For example, in my last post, I used OLS regression in a scatterplot comparing top-of-atmosphere (TOA) upwelling longwave (Y-axis) with surface temperature (X-axis). The problem is that both the TOA upwelling LW and the temperature data contain errors. Here’s that plot:

Figure 4. Scatterplot, monthly top-of-atmosphere upwelling longwave (TOA LW) versus surface temperature. The blue line is the incorrect OLS regression trend line.

But that’s not correct, because of the error in the X-axis. Once the commenter pointed out the problem, I replaced it with the correct Deming regression trend line.

Figure 5. Scatterplot, monthly top-of-atmosphere upwelling longwave (TOA LW) versus surface temperature. The yellow line is the correct Deming regression trend line.

And this is quite important. Using the incorrect trend shown by the blue line in Figure 4, I incorrectly calculated the equilibrium climate sensitivity as being 1°C for a doubling of CO2.

But using the correct trend shown by the yellow line in Figure 5, I calculate the equilibrium climate sensitivity as being 0.6 °C for a doubling of CO2 … a significant difference.

I do love writing for the web. No matter what subject I pick to write about, I can guarantee that there are people reading my posts who know much more than I do about the subject in question … and as a result, I’m constantly learning new things. It’s the world’s best peer-review.

[UPDATE] My friend Rud said in the comments below:

First, CERES is too short a data set to estimate ECS.

I replied that climate sensitivity depends on the idea that temperature must increase to offset the loss of upwelling TOA LW. What I’ve done is measure the relationship between temperature and TOA LW. I asked him to please present evidence that that relationship has changed over time … because if it has not, why would a longer dataset help us?

Of course, me being me, I then had to go take a look at a longer dataset. NOAA has records of upwelling TOA longwave since 1979, and Berkeley Earth has global gridded temperatures since 1850. So I looked at the period of overlap between the two, which is January 1979 to December 2020. Here’s that graph.

Figure 6. Scatterplot, NOAA monthly top-of-atmosphere upwelling longwave (TOA LW) versus Berkeley Earth surface temperature. The yellow line is the correct Deming regression trend line.

Would you look at that. Instead of using CERES data for the graph, I’ve used two completely different datasets—upwelling TOA longwave from NOAA and global gridded temperature data from Berkeley Earth. And despite that, I get the exact same answer to the nearest tenth of a watt per square meter— 3.0 W/m2 per °C.

My thanks to the commenter who put me on the right path, and my best regards to all,

w.

Pablo
January 9, 2022 2:14 pm

Surface temperature or eye level measuring height?

Nick Schroeder
January 9, 2022 2:19 pm

As the Cheshire Cat observed, if you don’t know where you are going, any path will take you there.
Correlation is still not cause.

Thomas Gasloli
January 9, 2022 2:19 pm

Are you sure you can just flip the X & Y like that? Doesn’t it say something wrong about the relationship between year & temp?

Michael S Rulle
Reply to  Thomas Gasloli
January 12, 2022 8:28 am

It can be flipped, but you will still get the same results. Somehow the author is confusing the angle difference with the measurement difference. The Deming and OLS give identical results; look at the change in temperature over the 20-year period for each.

Nick Schroeder
January 9, 2022 2:21 pm

Temperature should be expressed only in K because K has thermodynamic substance, C does not.

Willem Post
Reply to  Nick Schroeder
January 9, 2022 4:10 pm

C is just a different name to denote the same “substance”

Tom.1
January 9, 2022 2:24 pm

I cannot tell you how many OLS regression calculations I’ve done, and I’m stunned to learn that I have never heard of Deming regression. I wanted to try it in Excel, but I see that it is not in Excel’s built-in function library. I also could not find any VBA code that I could plug in. I think the explanation here helped me to understand it better: Deming regression – Wikipedia

Reply to  Tom.1
January 9, 2022 3:42 pm

Take a look here: https://www.real-statistics.com/free-download/real-statistics-resource-pack/

Free – donation optional. However, I have not used it, so I cannot attest to quality. (Other people I know do use the Real Statistics stuff and say it is good for their needs. Which may or may not include Deming regression.)

bigoilbob
Reply to  writing observer
January 9, 2022 6:42 pm

Deming regression is included, but I have not used it. But Xrealstat is spiffy for OLS trending of weighted data.

Joao Martins
Reply to  writing observer
January 10, 2022 1:37 am

Real Statistics stuff is good. My opinion. I checked the code and used it.

bigoilbob
Reply to  Tom.1
January 9, 2022 6:47 pm

It’s there, in the Xrealstat add-in (see below), but I don’t see how you can use it for data points that have changing x and y distributions (or just expected value x’s and distributed y’s) over the data set. I.e. in sea level station data or BEST temp data, both of which have changing y distributions with time. I just found it today, and will monkey with it some more…

Randy stubbings
Reply to  Tom.1
January 9, 2022 10:38 pm

Tom (and Willis), OLS calculations invoke several assumptions such as the errors being homoskedastic and having zero conditional means. There are also specific considerations when regressing time series data, such as checking for serial correlation and unit roots. Many books on econometrics cover the assumptions and the consequences when they are violated. OLS is also highly sensitive to outliers. An alternative to OLS is Theil-Sen regression, which has been used for weather-related data.
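(For anyone who wants to try the Theil-Sen estimator mentioned above, scipy has it built in; here is a minimal sketch with made-up monthly data and a few injected outliers:)

```python
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(3)
t = np.arange(252) / 12.0
temp = 0.02 * t + 0.1 * rng.standard_normal(t.size)
temp[::40] += 2.0                                 # inject a few large outliers

ols_slope = np.polyfit(t, temp, 1)[0]
ts_slope, ts_intercept, lo, hi = theilslopes(temp, t, alpha=0.95)

print(f"OLS trend      : {ols_slope:.4f} C/yr (pulled around by the outliers)")
print(f"Theil-Sen trend: {ts_slope:.4f} C/yr (95% CI {lo:.4f} to {hi:.4f})")
```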

frankclimate
Reply to  Tom.1
January 10, 2022 7:36 am

Deming Regression in Excel: https://peltiertech.com/deming-regression-utility/ . It’s a free add-in.

bigoilbob
Reply to  frankclimate
January 10, 2022 7:51 am

Folks, what was there to down click in this helpful post by frankclimate? Not rhetorical, what?

DMacKenzie
January 9, 2022 2:31 pm

0.6 C per CO2 doubling! Heads are exploding at IPCC right this moment….

JCM
Reply to  DMacKenzie
January 9, 2022 3:06 pm

0.6 near tropopause according to this analysis, likely approaching zero at TOA. Increasing tropospheric latent heat flux results in drying the upper atmosphere above the cloud deck. This neutralizes total column IR radiative effects.

JCM
Reply to  Willis Eschenbach
January 9, 2022 3:19 pm

OK, I see that now, thanks.

TimTheToolMan
Reply to  Willis Eschenbach
January 10, 2022 1:58 am

What is the point of top of troposphere values, then?

Surely the energy the earth is accumulating can only be measured at the true top of atmosphere?

Reply to  DMacKenzie
January 9, 2022 10:24 pm

0.6 C per CO2 doubling!

Heading in the right direction.

Richard M
Reply to  Philip Mulholland.
January 10, 2022 12:46 pm

Yes, this is really a number that requires further analysis since it doesn’t take into account the effect of other variables. For example, the solar TSI declined slightly during this period which could then mean the CO2 sensitivity might be higher. In the same light, cloud effects reduced reflected solar energy. Exactly the opposite effect on sensitivity. The sum of these two effects was a large increase in solar energy reaching the surface.

IOW, the CO2 sensitivity should be quite a bit lower.

In fact, a deeper analysis as provided by Dubal/Vahrenholt 2021 showed that the greenhouse effect (changes in upwelling LWIR) only increased during this period under clear skies. The data does not allow differentiation between GHGs, but this difference makes it pretty clear the changes were almost all due to water vapor.

rbabcock
January 9, 2022 2:36 pm

The one other issue is where you start and where you stop. The climate is a never ending story (at least until the Sun runs out of H2) and the time segments being used to validate arguments are so short that no matter what trends are calculated, all you have to do is wait a while and they will change.

Mike Dubrasich
Reply to  rbabcock
January 9, 2022 7:35 pm

Another issue is the assumption of linearity, especially in a demonstrably nonlinear relationship.

Disputin
Reply to  rbabcock
January 10, 2022 2:57 am

Most of the Sun is plasma, so H, not H2

Pat from kerbob
January 9, 2022 2:42 pm

0.6 is fast approaching zero
Or noise
Or meaninglessness

Ian Magness
Reply to  Pat from kerbob
January 9, 2022 2:55 pm

Exactly (or perhaps a valid approximation taking into account errors) Pat.
I wonder what this process would do to Lord Monckton’s present and previous pause results? Are not the pauses calculated using similar regressions?

Julius Sanks
January 9, 2022 2:51 pm

Willis, that’s all well and good, as was your first essay on the subject. But for those of us who build satellites, I’m not sure how useful it is. I suggest a practical engineering definition. All earth-orbiting spacecraft require thrusters for stationkeeping. But some require thrusters to counter atmospheric drag, and some do not. Satellites in low earth orbit do. I concede it’s still fuzzy. Designers must consider the bird’s drag coefficient on orbit, mass, operational life, and other factors before deciding whether to counter drag. So there is no single or simple answer. But I would suggest the following engineering definition of the top of the atmosphere: “The top of the atmosphere is the point at which the spacecraft does not require thrusters to counter atmospheric drag for the required mission life.”

Devils Tower
January 9, 2022 2:58 pm

Comment for modtran use
Lots of good info modeling the atmosphere (static)
Older pdf not behind paywall
You should be interested

https://www.google.com/url?q=http://web.gps.caltech.edu/~vijay/pdf/modrept.pdf&sa=U&ved=2ahUKEwiFhsWe4aX1AhUekokEHUxKBX8QFnoECAkQAg&usg=AOvVaw1VOhKdYYYzaVWRxCgLFdLg

David S
January 9, 2022 3:05 pm

It’s been half a century since my college days and my rusty old brain is creaking like an ancient iron gate. Must be time for a nap.

M Courtney
January 9, 2022 3:05 pm

Please give the helpful commenter’s handle.
It helps us lurkers know who has contributed and thus is worth more consideration.

Dudley Horscroft (@dudleyhorscroft)
Reply to  M Courtney
January 9, 2022 3:16 pm

Those scatter plots look rather like the relationship is centered about a single point, with variable errors, with a few outlying extreme values. Could it be that there is no relationship, and the extreme points which create the apparent relationship are just major errors and should be discounted?

Reply to  M Courtney
January 9, 2022 10:42 pm

Please give the helpful commenter’s handle.

His handle is Greg, his comment is here

Click on the link that says “commenter” Greg, it will take you to his comment.

Willis. Why is it so hard for you to acknowledge your sources by name?

January 9, 2022 3:21 pm

If only paid climate scientists and their followers fessed up like this. Just think what it would be like if they retracted papers as soon as they were demonstrated to be false!!!

Willis is very good.

January 9, 2022 3:22 pm

Just a probable typo…1st sentence under fig. 2:

“YIKES! That is way, way wrong. It greatly underestimates the true trend.”

Surely it overestimates, not underestimates, the true trend.

Fig. 1: 0.4 C difference
Fig. 2: 0.9 C difference

Or am I missing something?

Reply to  Willis Eschenbach
January 9, 2022 5:50 pm

OK, so I WAS missing something. Thanks for the clarification.

January 9, 2022 3:29 pm

Hi Willis,

What you call Deming regression is a special case of the more general problem of having not only errors in the X and Y values, but errors that are correlated with each other. The more general case was addressed by the late great Derek York (my PhD advisor) in:

York, Derek, 1969, Least-squares fitting of a straight line with correlated errors: Earth Planet Sci. Lett., v. 5, p. 320-324. 

This paper was later improved by Mahon, who corrected a minor error estimate value:

Keith I. Mahon (1996) The New “York” Regression: Application of an Improved Statistical Method to Geochemistry, International Geology Review, 38:4, 293-303, DOI: 10.1080/00206819709465336

This is an important issue in isotope geochemistry because when one regresses one isotope ratio to another, for example when fitting an “isochron”, the X and Y errors are frequently correlated. This is also an issue when either the X or Y value is a derived quantity where some part of the value is involved in the calculation for both the X and Y values. If the correlation coefficient for the X and Y values is zero, this is what you refer to as Deming regression.
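(For readers who want to experiment with the York approach, here is a rough sketch of the iterative solution along the lines of the York papers. It is only an illustration, not the code from the references above; it allows per-point X and Y errors and an X-Y error correlation r, and with uniform errors and r = 0 it reduces to the Deming case:)

```python
import numpy as np

def york_fit(x, y, sx, sy, r=0.0, tol=1e-12, max_iter=100):
    """Straight-line fit with errors in both x and y and optional x-y error correlation r."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    wx = 1.0 / np.asarray(sx, float) ** 2          # weights = 1 / variance
    wy = 1.0 / np.asarray(sy, float) ** 2
    r = np.broadcast_to(np.asarray(r, float), x.shape)
    alpha = np.sqrt(wx * wy)

    b = np.polyfit(x, y, 1)[0]                     # OLS slope as the starting guess
    for _ in range(max_iter):
        W = wx * wy / (wx + b ** 2 * wy - 2.0 * b * r * alpha)
        xbar, ybar = np.sum(W * x) / W.sum(), np.sum(W * y) / W.sum()
        U, V = x - xbar, y - ybar
        beta = W * (U / wy + b * V / wx - (b * U + V) * r / alpha)
        b_new = np.sum(W * beta * V) / np.sum(W * beta * U)
        if abs(b_new - b) < tol:
            b = b_new
            break
        b = b_new
    return b, ybar - b * xbar                      # slope, intercept
```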

January 9, 2022 3:42 pm

I’ve gone down this rabbit hole before. If time is the x-axis variable, then OLS regression gives the right answer, doesn’t it? Or am I missing something?

TimTheToolMan
Reply to  Willis Eschenbach
January 10, 2022 2:00 am

You are correct, because the errors in the time measurement are basically zero.

I doubt that’s true for proxy measurements? It’s rare for the age of a proxy to be precisely known.

bigoilbob
Reply to  Roy W Spencer
January 9, 2022 4:26 pm

Subject adjacent, and an opportunity to ask the expert. I’m sure it’s stat101 simple, but I’ve never seen a derivation of how uncorrelated, distributed y values (temp, sea level, etc.) vs. time increase the standard error of the resulting trend. I’ve looked, repeatedly. I have an intuitive idea of how it works, and have gotten close agreement with distributed BEST and sea level station data, using my idea and comparing it to brute force approximations (as in hundreds of thousands of excel rand functions). But I might still be off in my thinking and/or derivation, and would appreciate knowing how this evaluation is done properly.

Thanks in advance….
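(Not a derivation, but here is a hedged Monte Carlo sketch of the effect being asked about: per-point y uncertainty propagated into the standard error of an OLS trend, compared with the closed-form propagation through the OLS slope formula. All numbers are made up:)

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(240) / 12.0                       # 20 years of monthly time values
sigma_y = np.full(t.size, 0.15)                 # per-point y uncertainty (could vary point to point)
true_slope = 0.02

# Closed form: b_hat = sum(w_i * y_i) with w_i = (t_i - tbar) / sum((t_j - tbar)^2),
# so var(b_hat) = sum(w_i^2 * sigma_i^2) for independent y errors.
w = (t - t.mean()) / np.sum((t - t.mean()) ** 2)
se_analytic = np.sqrt(np.sum(w ** 2 * sigma_y ** 2))

# Brute force, i.e. what the spreadsheet rand() experiment approximates
slopes = [np.polyfit(t, true_slope * t + rng.normal(0.0, sigma_y), 1)[0]
          for _ in range(20000)]

print(f"analytic SE of the slope   : {se_analytic:.5f}")
print(f"Monte Carlo SE of the slope: {np.std(slopes):.5f}")
```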

bigoilbob
Reply to  Willis Eschenbach
January 10, 2022 6:19 am

Thank you Willis. This I knew. What interests me is how that standard error increases when some or all of the y values are uncorrelated distributions. As in BEST temp and sea level station data.

I will try and peruse a reference from another reply.

Thanks again.

Stephen Lindsay-Yule
January 9, 2022 3:54 pm

Top of atmosphere is 240 watts and radiates to the earth according to Willis Eschenbach. Not earth emits 340 watts and depletes with height by 240 watts. Because molecules are in constant motion and therefore emitting energy. Molecules further from the earth are at slower velocities and therefore have less energy (lapse rate). As do molecules in higher latitudes have less energy as ground is colder. Every 103.41 hPa decrease, the atmosphere is 6.5°C cooler, for 10 km. Willis sticks to radiative cooling and top down warming. Found in climate models. Molecules do not stop moving, reason top of atmosphere is at slowest and much less matter at 100 watts. Solar heating at the most sun-absorbed pole at 10 hPa is why the stratosphere is so warm. Battling with strong radiative cooling from carbon dioxide. Heights where high energy and strong cooling maintains the average at 100 watts.

Stephen Lindsay-Yule
Reply to  Willis Eschenbach
January 10, 2022 12:05 am

You’re unaware the earth emits an average of 340 watts (yet to discover 390 watts is incorrect).
You’re unaware the 70 hPa temperature range is -50° to -80°C (average 100 watts).
Unaware temperature is directly proportional to radiation. 100 watts is around -66°C.
You’re unaware 99.9% of the atmosphere expands with ascending height and compresses with descending height. And this increases (descending) and decreases (ascending) temperature. Change in temperature means change in energy (velocity).
Unaware solar irradiance heats the surface above what earth emits. Heats (shorter waves than earth’s longwave radiation) trace gases at the top of atmosphere (100 watts).
Earth’s emitted energy is transparent to space. Otherwise we would burn and die.
There are two poles whose energy (added together) is subtracted from energy at the equator. (460/2 – (460-230) 230(NP) – (460-230) 230(SP) = 0).
Your article is academic and only fits with imagined climate modeling and not real world observations.

Harkle Pharkle
Reply to  Stephen Lindsay-Yule
January 10, 2022 4:05 am

This is gibberish

Stephen Lindsay-Yule
Reply to  Harkle Pharkle
January 10, 2022 5:52 am

If you don’t do the work, your mind will reject what you read. I have over a year’s work and am convinced of what the observations tell me.

[Attached image: 06112021GMT.png]
Harkle Pharkle
Reply to  Stephen Lindsay-Yule
January 10, 2022 7:12 pm

That may be true but you need to use actual English sentences with approximately correct grammar if you want to communicate your thoughts to this audience.

Willem Post
January 9, 2022 4:08 pm

Willis, if you as an experienced veteran did not know, I am willing to bet very few others knew either, say 99%?

What about all the graphs out there constructed by THOSE folks?

Kip Hansen (@kiphansen2)
Editor
January 9, 2022 4:21 pm

Too much math, too many statistics, not enough thought.

Anyone interested in the overall effect of increasing CO2 in the atmosphere should refer themselves to someone who understands the underlying topic physics — maybe Will Happer?

Rud Istvan
Reply to  Kip Hansen
January 9, 2022 4:32 pm

I made a comment which disappeared. Disagree with the WE result for three reasons. In sum:
First, CERES is too short a data set to estimate ECS.
Second, there are three ‘observational’ ways to derive ECS: energy budget, Lindzen Bode feedback observationally corrected from IPCC, and Monckton’s equation. All three converge on about 1.7C (as does Callendar’s 1938 curve).
Third, the zero feedback can be calculated reliably between 1.1-1.2C (Monckton’s equation produces 1.16C). So anything below that no-feedback value, including feedbacks, means a significant negative feedback—very unlikely given absolute humidity averages about 2%, so WVF MUST be significantly positive.

PCman999
Reply to  Rud Istvan
January 9, 2022 4:51 pm

But if you consider the huge increase in CO2 over the satellite era and compare that to the flat temp response once the El Nino step-ups in 1998 and 2016 have been removed, it seems temps are ignoring CO2.

Rud Istvan
Reply to  PCman999
January 9, 2022 5:10 pm

Trends are too short to draw such a conclusion. There are clear superimposed natural cycles on order of 60 and a few hundred years.

TimTheToolMan
Reply to  Willis Eschenbach
January 10, 2022 2:15 am

To me, the whole idea that there are significant positive feedbacks is a non-starter because of the stability of the system …

I don’t understand where this “community certainty” has come from either. Habit, I guess.

Much like the community certainty that AGW must be worse for us, without so much as the briefest consideration that it might actually be better.

Kip Hansen (@kiphansen2)
Editor
Reply to  Willis Eschenbach
January 10, 2022 8:04 am

w. ==> What Rud said … that’s the “thought” side. All scientific investigations must begin with lots and lots of thought, and then one may appropriately apply maths to the right sets of carefully selected data.

Starting out with maths is almost always backwards.

I know we disagree about this — have for years.

TimTheToolMan
Reply to  Kip Hansen
January 10, 2022 12:45 pm

I think some of the most insightful results come from simple comparisons well expressed. This goes to the core idea in AGW that the surface must warm to restore the balance.

Clyde Spencer
Reply to  TimTheToolMan
January 10, 2022 3:43 pm

I think that a good example is Einstein’s thought experiments, before he summarized them with succinct mathematical expressions.

Julian Flood
January 9, 2022 4:38 pm

Willis, does this explain why the Sea of Marmara seems to be warming at a rate of circa 0.5K per decade?

JF
(I know, I know, but I wish someone would come up with a reason. Could it be anything to do with sea snot?)

Geoff Sherrington
January 9, 2022 6:07 pm

Willis, Your essay is very neat, thank you.
Back in the 1960s there were no computers, but we did use statistics, often by pencil and paper AND eraser. Like, I did Analysis of Variance by Fisher’s original equations, dozens of times, looking at levels and types of fertilizers affecting plant yields in factorial experiments. I mention this because manual work causes focus on individual numbers (as opposed to the giant Hoovers that suck them into impartial computers). Because one often saw a number that might be suspicious, errors and uncertainties were more front of mind than seems to be the case in modern day work.
You might have seen me going on about errors in posts at WUWT over the decade. The short conclusion is that there remains a lot of ignorance among climate change researchers about error and uncertainty. (Just ask Pat Frank). I have seen a few excellent papers with proper error treatment, but they are rare as hens’ teeth or proper apostrophe use.
So, when you discuss the need for knowledge about errors in both X and Y parameters for a valid regression, one has to assume that the magnitude of the errors is both known and correct.
Particularly with measurements of incoming and outgoing flux at top of atmosphere, with the important tiny difference between these 2 large numbers, error determination is critical. Sadly, given the way that the numbers are obtained, without much scope to tightly replicate measurements from satellites, the customary couple of watts per square metre numbers are more likely to show creative accounting than repeatable, valid errors. Different satellites gave gross numbers around the 1350 units, but there were differences of some 15 units between satellites that were adjusted out, resulting in people now quoting differences of small values like 0.1 W/m^2. Sorry, but this does not seem justified. So, in a sense, it is back to the drawing board with your regression story, so that a valid error can be used on both axes.
All the best Geoff S
http://www.geoffstuff.com/toa_problem.jpg

Jim Gorman
Reply to  Willis Eschenbach
January 10, 2022 11:41 am

Willis,

I must disagree with your characterizing Standard Error of the Mean (SEM) as the “getting to the error”.

Although Wikipedia is not always accurate, it has a good description of Standard Error (also SEM).

https://en.wikipedia.org/wiki/Standard_error

The standard error (SE)[1] of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution[2] or an estimate of that standard deviation. If the statistic is the sample mean, it is called the standard error of the mean (SEM).[1]

The sampling distribution of a mean is generated by repeated sampling from the same population and recording of the sample means obtained. This forms a distribution of different means, and this distribution has its own mean and variance. Mathematically, the variance of the sampling distribution obtained is equal to the variance of the population divided by the sample size. This is because as the sample size increases, sample means cluster more closely around the population mean.

The upshot is that SEM (or SE) is the description of how well a sample means distribution with a given sample size captures the correct value of a population mean. It is only useful if you have done an adequate job of sampling the population. It is not an “error” of the real population mean; that is best shown as the parameter known as the Standard Deviation of the population.

Again, from Wiki:

Therefore, the relationship between the standard error of the mean and the standard deviation is such that, for a given sample size, the standard error of the mean equals the standard deviation divided by the square root of the sample size.[1] In other words, the standard error of the mean is a measure of the dispersion of sample means around the population mean.

I can’t emphasize enough that if you declare an entire 5000 or 10000 stations as a single sample (i.e. n = 5000 or 10000), then the standard deviation of that entire distribution is the SEM. As a sample statistic, the proper way to determine the population Standard Deviation is to multiply the SEM by (sqrt n).

To summarize, for one to know how well the mean represents variation in the values of the data, one must use a statistical parameter such as standard deviation. The SEM is not a statistical parameter, it is a simple statistic of a sample distribution that can be used to obtain an estimate of a population standard deviation.

Here are some links to help explain.

https://www.scribbr.com/statistics/standard-error/

https://www.investopedia.com/ask/answers/042415/what-difference-between-standard-error-means-and-standard-deviation.asp

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1255808/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2959222/#

Lastly, way too many scientists have gone off the rails with these statistical calculations. Even the NIH acknowledges this in the documents I have referenced.

You may estimate a population mean very accurately, but the SEM is not a parameter that properly defines how well the mean categorizes the entirety of the data. Only a properly calculated Standard Deviation will do that.
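(A quick numerical illustration of the SD versus SEM distinction described above, with made-up numbers:)

```python
import numpy as np

rng = np.random.default_rng(5)
population = rng.normal(loc=15.0, scale=5.0, size=1_000_000)   # population SD is 5
n = 100

sample = rng.choice(population, n)
sample_means = [rng.choice(population, n).mean() for _ in range(5000)]

print("sample SD (spread of the data)      :", np.std(sample, ddof=1))                # ~5
print("SEM = SD / sqrt(n)                  :", np.std(sample, ddof=1) / np.sqrt(n))   # ~0.5
print("SD of many sample means (empirical) :", np.std(sample_means))                  # ~0.5
```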

Jim Gorman
Reply to  Willis Eschenbach
January 10, 2022 5:51 pm

Sorry I should have copied and pasted instead of typing off my memory. The phrase was “two major sources of error in getting to the data”.

John Entwistle
Reply to  Geoff Sherrington
January 10, 2022 4:14 am

“manual work causes focus on individual numbers (as opposed to the giant Hoovers that suck them into impartial computers)”

That’s why I am so suspicious of grid averages and monthly averages of temperatures. I’ve spent hours scrolling through daily temperature records for dozens of different locations around the US. There are a lot of errors and a lot of missing data points in the records, never mind the observation and instrument errors that must accompany older observations. You can’t just average this up and make the errors go away.

bigoilbob
Reply to  Geoff Sherrington
January 10, 2022 9:16 am

“I have seen a few excellent papers with proper error treatment, but they are rare as hens’ teeth or proper apostrophe use.”

Links?

Kevin kilty
January 9, 2022 6:28 pm

In his 1964 book “The Statistical Analysis of Experimental Data” John Mandel said, “The proper method for fitting a straight line to a set of data depends on what is known or assumed about the errors affecting the two variables.” He broke the topic into four cases:

1) classical OLS where x is known exactly
2) Errors in both variables, Deming’s generalized use of least squares
3) The case of controlled errors in the x variable where we only “aim” at particular measurement values (Berkson, Are there two regressions? J. Am. Stat. Assoc. 45, 164-180, 1950)
4) Cumulative errors in the x variable.

What appears simple always has a bit more to it.

bigoilbob
Reply to  Kevin kilty
January 9, 2022 6:59 pm

Thx Kevin, I’ll hunt for it. 1b or 2 might explain how to analyze single-point x data but uncorrelated, distributed y data. Per my question for Dr. Spencer.

Keith Minto
Reply to  Kevin kilty
January 9, 2022 7:10 pm

It would be interesting to see a compilation of “eyeball” straight line trends and see if there is some agreement on the trend line.

January 9, 2022 7:12 pm

Quick question. What were the input X and Y errors used for your corrected regression? You should also have been given, as output, error estimates for both the slope and intercept. For your purpose, only the slope error estimate is important.

Reply to  Willis Eschenbach
January 10, 2022 12:47 pm

Thanks. That appears to address the input errors for the temperatures. It’s probably the best you can do with what you’ve got. Do you assume the TOA power values have a uniform error? If so, what would that error estimate be? If you have a complete set of input errors, you should have a goodness of fit parameter such as a chi-squared value that results from the fit. Are your slope error estimates purely from a priori errors, or are they adjusted to include scatter about the line (a posteriori). That is usually done by multiplying the a priori error estimate by the square root of the chi-squared variable divided by n-2 (i.e., the number of degrees of freedom). This “adjusts” the input errors to the point where the chi-squared would be n-2. Sorry for the questions, but these are standard issues in the isotope geochemistry biz.

What a lot of people do not realize is that a correct approach to linear regression is in fact a rather knotty bit of non-linear inverse theory!
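(For concreteness, a minimal sketch of the a posteriori scaling described above; the function and variable names are made up for illustration:)

```python
import numpy as np

def a_posteriori_slope_error(x, y, sigma_y, slope, intercept, slope_err_apriori):
    """Scale an a priori slope error by sqrt(chi^2 / (n - 2)), as described above."""
    resid = y - (slope * x + intercept)
    chi2 = np.sum((resid / sigma_y) ** 2)   # misfit measured against the stated errors
    mswd = chi2 / (len(x) - 2)              # reduced chi-squared
    return slope_err_apriori * np.sqrt(mswd), mswd
```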

bdgwx
January 9, 2022 7:45 pm

Assuming the Myhre 1998 value of 3.7 W/m2 and the WE value of 0.6 C per 2xCO2 are correct, that would be a climate sensitivity of 0.6 C / 3.7 W/m2 = 0.16 C/W.m2. The 1 C warming that has occurred would have required 6.2 W/m2 plus the 0.8 W/m2 of EEI for a total of 7.0 W/m2. And given the -1 W/m2 of aerosol forcing that means we need 8.0 W/m2 of positive forcing. Where did the +8.0 W/m2 forcing come from?

bdgwx
Reply to  Willis Eschenbach
January 10, 2022 7:06 am

It is my understanding that the 0.6 C per 2xCO2 ECS figure is agnostic of that detail. So if the tropopause forcing is +3.7 W/m2 per 2xCO2 then the climate sensitivity is 0.6 C / 3.7 W/m2 = 0.16 C per W/m2 in terms of tropopause forcing. You can plug in any value of W/m2 and the sensitivity would then be in terms of that value whether it be tropopause, TOA, or any other reference height. I just happen to choose the 3.7 W/m2 tropopause value provided by Myhre 1998 so the 0.16 C per W/m2 sensitivity is thus in reference to the tropopause forcing.

Pat Frank
January 9, 2022 7:51 pm

What error (uncertainty) did you use for the surface temperature, Willis? 🙂

Reply to  Willis Eschenbach
January 9, 2022 8:37 pm

These results are close to Kaplan’s 1960 paper on the subject. He refuted Plass’s modeled CO2 numbers as being 2-3x too high. Plass is the basis of all of Gavin Schmidt’s work.

Rick C
Reply to  Willis Eschenbach
January 10, 2022 8:33 am

Willis: Very interesting analysis. I’m ashamed to say that I have a degree in math with an emphasis on statistics and yet was not familiar with the Deming regression. In my defense, I did look through my college textbooks and found no mention of it. Of course that was long ago and perhaps before Deming was recognized for his brilliance by US professors. I did read Deming and Juran later in life, but I don’t recall seeing this discussed. Anyway, I did often analyze correlation data by regressing both X vs Y and Y vs X, and when the two results agreed, concluded that more confidence was warranted. Disagreement to me meant the data was suspect.

One question though. How does the R-squared value or the correlation coefficient compare between the OLS and Deming regressions?

Thank you for an illuminating discussion. We all can still learn a lot.

Izaak Walton
January 9, 2022 8:21 pm

According to the Stefan-Boltzmann law a 1 degree increase in temperature from 287K to 288K would result in a 5.4 W/m^2 increase in radiation. The fact that Willis got a value of 3 W/m^2 means that he is effectively calculating the emissivity of the earth using two different data sets and it should be no surprise that he got the same answer each time.
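(The 5.4 W/m^2 figure is easy to check for a blackbody, i.e. emissivity of 1:)

```python
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

# Change in blackbody emission for a 287 K -> 288 K warming
delta = SIGMA * (288.0 ** 4 - 287.0 ** 4)
print(f"{delta:.2f} W/m^2")   # roughly 5.4 W/m^2, as stated above
```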

DMacKenzie
Reply to  Izaak Walton
January 10, 2022 9:22 am

Emitted by the ground at 288 K is 390 W/m^2 IR, while TOA is 240 W/m^2 … the difference being what most people refer to as the “greenhouse effect” … Willis’ calcs are TOA …

meab
January 9, 2022 8:41 pm

Willis: A commenter pointed out Deming regression to you June 13, 2012 in your post entitled “Observations on TOA Forcing vs Temperature”. Deja vu all over again? Here’s a bit of his comment:
Willis: The slope of the linear regression of Y on X is simply the reciprocal of the slope of the linear regression of X on Y.

Commenter:

that’s seldom true (that is, the probability of it being true is 0.) If you perform the linear regression of y on x and call the slope estimate b(Y|X), and do the regression of x on y and call the slope estimate b(X|Y), then b(Y|X) =/= 1/b(X|Y).
You can look this up on Wikipedia under the topic “Deming Regression”. I have found Wikipedia entries on statistical topics to be quite good. You can also look it up on Mathematica.

I’ve left comments regarding Deming Regression in a different context before. Here’s a comment I left on an analysis (not yours) that just threw out data because it had higher error than other data:

meab:

It’s not proper to eliminate data because of perceived high error. You can weight the data, but if you eliminate it you’ve biased the result. By doing this, they booted their analysis. Look up the Deming method for properly weighting data by its associated error. And yes, it’s the same Deming who radically improved Japanese production quality.

One of my degrees is in Mathematics with an emphasis on statistics – an area in which many alarmist analyses are demonstrably lacking.

mkelly
Reply to  meab
January 10, 2022 7:48 am

Meab, thanks for mentioning that this was from The Deming. He singlehandedly saved Japanese industry which forced the American auto industry to eventually use his SPC methods for quality control.

January 9, 2022 11:23 pm

The one good recommendation from the Climategate enquiries was for a person skilled in stats to be attached to each climate research team.

I understand this has not been implemented.

TimTheToolMan
January 10, 2022 1:46 am

But using the correct trend shown by the yellow line in Figure 5, I calculate the equilibrium climate sensitivity as being 0.6 °C for a doubling of CO2 … a significant difference.

From AR6

https://www.ipcc.ch/report/ar6/wg1/downloads/report/IPCC_AR6_WGI_SPM_final.pdf

A.4.4 The equilibrium climate sensitivity is an important quantity used to estimate how the climate responds to radiative forcing. Based on multiple lines of evidence,21 the very likely range of equilibrium climate sensitivity is between 2°C (high confidence) and 5°C (medium confidence). The AR6 assessed best estimate is 3°C with a likely range of 2.5°C to 4°C (high confidence), compared to 1.5°C to 4.5°C in AR5, which did not provide a best estimate. {7.4, 7.5, TS.3.2}

My instinct tells me the ECS is less than the radiative forcing alone (ie less than 1.1C) because CO2 is passive and feedbacks will be overall negative in the long term.

There is a hilarious list of contributing factors “factors du jour” in the IPCC report.

January 10, 2022 3:35 am

Danny Braswell and I demonstrated the importance of this issue in our 2011 Remote Sensing paper, where we showed that a mixture of radiative forcing (mostly uncorrelated with temperature because of the climate system’s heat capacity) and radiative feedback (highly correlated with temperature) leads to scatter plots of temperature versus radiation that produce OLS regression estimates of the feedback parameter which are too low (biased toward high climate sensitivity). We further harped on this issue in the following years in other papers. The Remote Sensing paper is the one where the editor resigned the day after publication, after Trenberth criticized him for allowing it. https://www.mdpi.com/2072-4292/3/8/1603

TimTheToolMan
Reply to  Roy W Spencer
January 10, 2022 4:56 am

Yeah, it was pretty sad how he could harp on about how your simplified model was too simple (eg no ENSO) when the simulations of ENSO at the time were biased and poor. Apparently it’s better to take into account an effect that’s modelled badly than to simplify it.

Tom.1
January 10, 2022 5:24 am

In the discussion of errors here, it is not entirely clear what kind of errors we are talking about. Regression (curve fitting) is a process of reducing the “error” between some data and a mathematical model which is intended to show a relationship between some independent variable or variables and a dependent variable. The data may have errors due to the fact that we simply lack the ability to measure or know exactly what the value of something is. The curve fitting error can result from the fact that we simply have not accounted for all the independent variables, or it can be due to the fact that there are errors in the determination of the values of both the dependent and independent variables. Can someone talk about this please.

HenryP
January 10, 2022 10:22 am

Hallo Willis

I finally figured it out. The “global” warming…
https://breadonthewater.co.za/2022/01/10/global-warming-due-to-ehhh-global-greening/

Let me know what you think.

January 10, 2022 12:19 pm

Linear regression only works if ALL of the variables are taken into account and ALL of the regression equations are solved simultaneously. This includes the regression of the dependent variable on its own previous values. The ARX method, for “auto-regression with exogenous variable”, is described here: https://blackjay.net.au/measuring-climate-change/ . It gives both Impulse Response and Sensitivity.
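(The linked page describes the method in detail; purely as an illustration of the idea, here is a toy first-order ARX fit, regressing a made-up temperature series on its own previous value plus an exogenous forcing. Everything here is synthetic and hypothetical:)

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
forcing = np.cumsum(rng.normal(0.0, 0.05, n))            # toy exogenous forcing series
temp = np.zeros(n)
for i in range(1, n):                                    # toy ARX(1) process
    temp[i] = 0.9 * temp[i - 1] + 0.1 * forcing[i] + rng.normal(0.0, 0.05)

# Fit T_t = a + b * T_{t-1} + c * F_t by ordinary least squares
X = np.column_stack([np.ones(n - 1), temp[:-1], forcing[1:]])
(a, b, c), *_ = np.linalg.lstsq(X, temp[1:], rcond=None)

print(f"persistence b = {b:.3f}, immediate response c = {c:.3f}")
print(f"long-run sensitivity c / (1 - b) = {c / (1 - b):.3f} per unit of forcing")
```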

See - owe to Rich
January 10, 2022 2:39 pm

Willis,

The paper Ramanathan, V., and A. Inamdar, 2006: The radiative forcing due to clouds and water vapor. In J. Kiehl & V. Ramanathan (Eds.), Frontiers of Climate Modeling (pp. 119-151). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511535857.006 gives a value for your quantity which is similar, but not the same to 1 decimal place, and is:

3.53 Wm-2K-1

I am writing a paper on climate feedback which uses this value.

HTH, Rich.

Old Grumpus
January 11, 2022 2:03 am

A lot of scatter.
What are the 1sd error bars on the 0.6C per doubling?

Geoff Sherrington
January 11, 2022 2:45 am

Willis,
Please pardon the length and lateness of this. Geoff S
……………………
A crucial, but neglected duty of the senior scientist of today is to examine the “perceived wisdom” to see if it is really “deliberate deceit”; to publicise examples of the latter, to try to change guesses into reliable, derived values that can be reproduced.
Willis, you wrote above “… the standard error of the mean of the 64,800 individual gridcells that are averaged to give each month’s value. For the temperature, this is about 0.06°C.”
The question arises, “Is this the appropriate error to use to support the work?” You used the Ceres satellite measurement of surface temperature as well as the Berkeley Earth temperature product.
In the case of measurement of water temperature, one can draw on experience of measurement performance by experts under top, controlled conditions. A few years ago I asked the Brits at their National Physics Laboratory how good they were at measuring water temperatures.
My question was –
“Does NPL have a publication that gives numbers in degrees for the accuracy and precision for the temperature measurement of quite pure water under NPL controlled laboratory conditions?
At how many degrees of total error would NPL consider improvement impossible with present state-of-art equipment?”
Part of their answer was –
“NPL has a water bath in which the temperature is controlled to ~0.001 °C, and our measurement capability for calibrations in the bath in the range up to 100 °C is 0.005 °C. However, measurement precision is significantly better than this. The limit of what is technically possible would depend on the circumstances and what exactly is wanted.”
Australia’s National Measurement Institute answered “The selection of a suitable temperature sensor  and its readout is mostly based on the overall uncertainty, the physical constraint (contact/immersion), manual or auto-logging, available budget… The most accurate (most expensive) sensor is a standard platinum resistance thermometer at mK level uncertainty.” (A mK is 0.001 Kelvin).
……………………….
Moving from optimised specialist laboratories to the real world of ocean T measurement by buoys, we see claims like this – “The temperatures in the Argo profiles are accurate to ± 0.002°C”
https://argo.ucsd.edu/data/data-faq/#accurate
This claim is laughable. It is an example of “deliberate deceit” to claim that they can do as well as the top measurement labs under laboratory conditions.
Next, we have the Ceres claim of the standard error of the mean of about 0.06 deg C. However, for your application, Willis, you do not need the standard error of the means of the gridcells – you need the absolute error involved over the whole measurement process.
Finally to Berkeley Earth. I searched the Net for about 20 minutes looking for a mathematical error figure for ocean temperatures and managed to find one. In the following reference, we have “Figure 2. Component uncertainties for the ocean average of HadSST v3 and the corresponding transformed forms of those components after the application of the interpolation scheme described in the text. All uncertainties are expressed as appropriate for 95 % confidence intervals on annual ocean averages.” Their graph of total uncertainty varies over the decades from about 0.05 to 0.2 deg C.
https://essd.copernicus.org/articles/12/3469/2020/essd-12-3469-2020.pdf
……………………………

In summary, many modern authors are wary about quoting any numbers for accuracy, error or uncertainty of ocean temperature measurements. Most do not even know the difference between these 3 terms. There is a preference to carry on as if these are matters, like the content of sausages, about which Mark Twain said with the law in mind “People who love sausage and respect the law should never watch either one being made.”

Pat Smith
January 11, 2022 8:25 am

Willis, you usually say ECS but don’t you really mean TCR? The immediate change rather than the one that takes place years or centuries later? If you’ve answered this a hundred times before, please ignore!

bdgwx
Reply to  Willis Eschenbach
January 11, 2022 12:58 pm

It takes a while for the surface temperature to respond to an energy imbalance, though. There’s currently a +0.8 W/m2 imbalance which will take at least a couple of decades to increase the temperature enough to restore a balance, even if the forcing that caused the imbalance drops to zero.

lgl
Reply to  Willis Eschenbach
January 11, 2022 1:36 pm

No, ECS is after the oceans have warmed, so you are not dealing with ECS.

lgl
Reply to  Willis Eschenbach
January 15, 2022 5:22 am

No, you are missing the point. ECS is not defined that way. You have invented your own ECS, the Eschenbach Climate Sensitivity.

lgl
Reply to  Willis Eschenbach
January 16, 2022 1:01 am

But the effect isn’t immediate. If it were, you could have calculated the sensitivity from the daily solar input variation. And it’s not just the oceans. It takes time to warm the atmosphere too, where most of the radiation originates, 2-3 months judging by the surface-UAH/RSS lag.

https://en.wikipedia.org/wiki/Climate_sensitivity

“It is a prediction of the new global mean near-surface air temperature once the CO2 concentration has stopped increasing, and most of the feedbacks have had time to have their full effect”

Why didn’t you calculate the sensitivity using the post-1979 data?

Richard M
January 11, 2022 7:23 pm

Interesting that this number is very similar to this sensitivity calculation.

https://www.academia.edu/39277492/Challenging_the_Greenhouse_Effect_Specification_and_the_Climate_Sensitivity_of_the_IPCC?email_work_card=title

“According to the two methods of this study, the climate sensitivity parameter λ is 0.27 K/(Wm-2 ). It is about half of the λ value 0.5 K/(Wm-2) applied by the IPCC and the reason is in water feedback. Based on these two findings, the TCS is only 0.6°C.”

Richard M
January 11, 2022 8:02 pm

I believe what is actually being calculated here is the amount of energy required to raise the temperature within our atmosphere. It has nothing to do with CO2. Downwelling radiation is an irrelevancy. Temperature change is based on energy in – energy out.

The only way to change the temperature is to increase energy in or decrease energy out. While it may look like blocking some outgoing radiation could reduce energy out, all it does is activate other energy transport mechanisms within the atmosphere. The same amount of energy still gets radiated to space.

What really happens with increased CO2 is you get a little bit more activity within the atmosphere which helps spread out the energy a little more evenly. This will warm colder areas and cool warmer areas.

So, why have we warmed? More energy is coming in due to a reduction in cloud reflectivity.

lgl
Reply to  Richard M
January 12, 2022 7:23 am

Most of the radiation to space comes from the atmosphere [image in original comment].
The temperature of the atmosphere or surface or both will have to increase to restore the balance, and it’s of course both. The surface is warmer than the atmosphere so there is no reason to believe the temperature of the surface will increase less than the temperature of the atmosphere.

Michael S Rulle
January 12, 2022 8:21 am

???—The change in temperature in the two regressions (Deming and OLS) is identical—the angles of the two regression lines are of course different, as the dependent variable and independent variable were switched—but the two are exactly the same—it is impossible for them not to be the same.

lgl
Reply to  Michael S Rulle
January 12, 2022 9:42 am

I just checked this. RSS data into Excel and added trends. The ‘normal’ way the trend is 0.021 K/yr. Switching the variables, the trend becomes 34.47 yr/K. 1/34.47 = 0.029 K/yr, not 0.021, so Willis is right.
