Critique of the new Santer et al. (2019) paper

From Dr Judith Curry’s Climate Etc

Posted on March 1, 2019 by curryja

by Ross McKitrick

Ben Santer et al. have a new paper out in Nature Climate Change arguing that with 40 years of satellite data available they can detect the anthropogenic influence in the mid-troposphere at a 5-sigma level of confidence. This, they point out, is the “gold standard” of proof in particle physics, even invoking for comparison the Higgs boson discovery in their Supplementary information.

FIGURE 1: From Santer et al. 2019

Their results are shown in the above Figure. It is not a graph of temperature, but of an estimated “signal-to-noise” ratio. The horizontal lines represent sigma units which, if the underlying statistical model is correct, can be interpreted as points where the tail of the distribution gets very small. So when the lines cross a sigma level, the “signal” of anthropogenic warming has emerged from the “noise” of natural variability by a suitable threshold. They report that the 3-sigma boundary has a p value of 1/741 while the 5-sigma boundary has a p value of 1/3.5 million. Since all signal lines cross the 5-sigma level by 2015, they conclude that the anthropogenic effect on the climate is definitively detected.
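
As a quick check, the quoted odds are simply the upper-tail probabilities of a standard Normal distribution at 3 and 5 sigma. A short Python sketch (standard library only; my own check, not the authors' code):

```python
import math

# Upper-tail (one-sided) probability of a standard Normal variate
# exceeding `sigma` standard deviations.
def one_tail_p(sigma):
    return 0.5 * math.erfc(sigma / math.sqrt(2))

print(round(1 / one_tail_p(3)))   # about 741
print(round(1 / one_tail_p(5)))   # about 3.5 million
```

These probabilities are only meaningful, of course, if the statistic really does follow a standard Normal distribution, a point taken up in section (d) below.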

I will discuss four aspects of this study which I think weaken the conclusions considerably: (a) the difference between the existence of a signal and the magnitude of the effect; (b) the confounded nature of their experimental design; (c) the invalid design of the natural-only comparator; and (d) problems relating “sigma” boundaries to probabilities.

(a) Existence of signal versus magnitude of effect

Suppose you are tuning an old analog receiver to a weak signal from a far-away radio station. By playing with the dial you might eventually get a good enough signal to realize they are playing Bach. But the strength of the signal tells you nothing about the tempo of the music: that’s a different calculation.

In the same way the above diagram tells us nothing about the magnitude of the temperature effect of greenhouse gases on the climate. It only shows the ratio of two things: the rate at which the correlation between observations and models forced with natural and anthropogenic forcings improves over time, divided by the standard deviation of that rate under a “null hypothesis” of (allegedly) pure natural variability. In that sense it is like a t-statistic, which is also measured in sigma units. Since there can be no improvement over time in the fit between the observations and the natural-only comparator, any improvement in the fit to the forced models raises the sigma level.

Even if you accept Figure 1 at face value, it is consistent with there being a very high or very low sensitivity to greenhouse gases, or something in between. It is consistent, for instance, with the findings of Christy and McNider, also based on satellite data, that sensitivity to doubled GHG levels, while positive, is much lower than typically shown in models.

(b) Confounded signal design

According to the Supplementary information, Santer et al. took annually-averaged climate model data based on historical and (RCP8.5) scenario-based natural and anthropogenic forcings and constructed mid-troposphere (MT) temperature time series that include an adjustment for stratospheric cooling (i.e. “corrected”). They averaged all the runs and models, regridded the data into 10 degree x 10 degree grid cells (576 altogether, with polar regions omitted) and extracted 40 annual temperature anomalies for each gridcell over the 1979 to 2018 interval. From these they extracted a spatial “fingerprint” of the model-generated climate pattern using principal component analysis, aka empirical orthogonal functions. You could think of it as a weighted average over time of the anomaly values for each gridcell. Though it’s not shown in the paper or the Supplement, this is the pattern (it’s from a separate paper):

FIGURE 2: Spatial fingerprint pattern

The gray areas in Figure 2 over the poles represent omitted gridcells since not all the satellite series cover polar regions. The colors represent PC “loadings” not temperatures, but since the first PC explains about 98% of the variance, you can think of them as average temperature anomalies and you won’t be far off. Hence the fingerprint pattern in the MT is one of amplified warming over the tropics with patchy deviations here and there.
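
For readers who want to see the mechanics, here is a minimal sketch of how a leading EOF is extracted from a year-by-gridcell anomaly matrix. The data are synthetic and the 40-year by 576-gridcell layout is merely assumed to match the description above; this is an illustration of the technique, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the model-average MT anomalies: 40 years by
# 576 gridcells, a fixed spatial pattern whose amplitude grows over
# time, plus noise. Illustrative only, not the Santer et al. data.
years, cells = 40, 576
pattern = rng.normal(size=cells)
amplitude = np.linspace(0.0, 1.0, years)
anoms = np.outer(amplitude, pattern) + 0.1 * rng.normal(size=(years, cells))

# The leading EOF is the first right-singular vector of the centered
# year-by-gridcell anomaly matrix.
centered = anoms - anoms.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
fingerprint = vt[0]                    # spatial loadings
explained = s[0]**2 / np.sum(s**2)     # fraction of variance explained

# Up to sign, the leading EOF recovers the embedded pattern.
r = abs(np.corrcoef(fingerprint, pattern)[0, 1])
print(round(float(explained), 2), round(float(r), 2))
```

When one pattern dominates the data, as in this toy example, the first EOF explains most of the variance, which is why the loadings in Figure 2 can loosely be read as average anomalies.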

This is the pattern they will seek to correlate with observations as a means of detecting the anthropogenic “fingerprint.” But it is associated in the models with both natural and anthropogenic forcings together over the 1979-2018 interval. They refer to this as the HIST+8.5 data, meaning model runs forced up to 2006 with historical forcings (both natural and anthropogenic) and thereafter according to the RCP8.5 forcings. The conclusion of the study is that observations now look more like the above figure than the null hypothesis (“natural only”) figure, ergo anthropogenic fingerprint detected. But HIST+8.5 is a combined fingerprint, and they don’t actually decompose the anthropogenic portion.

So they haven’t identified a distinct anthropogenic fingerprint. What they have detected is that observations exhibit a better fit to models that have the Figure 2 warming pattern in them, regardless of cause, than those that do not. It might be the case that a graph representing the anthropogenic-only signal would look the same as Figure 1, but we have no way of knowing from their analysis.

(c) Invalid natural-only comparator

The above argument would matter less if the “nature-only” comparator controlled for all known warming from natural forcings. But it doesn’t, by construction.

The fingerprint methodology begins by taking the observed annual spatial pattern of temperature anomalies and correlating it to the pattern in Figure 2 above, yielding a correlation coefficient for each year. Then they look at the trend in those correlation coefficients as a measure of how well the fit increases over time. The correlations themselves are not reported in the paper or the supplement.
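
The two steps just described, one pattern correlation per year followed by a trend through those correlations, can be sketched as follows. The fingerprint and "observations" here are synthetic and hypothetical; this illustrates the method, not the authors' calculation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fingerprint and observations (synthetic; not the
# Santer et al. data). The fingerprint gradually emerges from noise
# in the "observations".
years, cells = 40, 576
fingerprint = rng.normal(size=cells)
obs = (np.linspace(0.0, 1.0, years)[:, None] * fingerprint
       + rng.normal(size=(years, cells)))

# Step 1: one spatial pattern correlation per year.
corrs = np.array([np.corrcoef(obs[t], fingerprint)[0, 1]
                  for t in range(years)])

# Step 2: the least-squares trend in those correlations measures how
# the fit improves over time.
slope = np.polyfit(np.arange(years), corrs, 1)[0]
print(slope > 0)   # prints True: the fit to the fingerprint improves
```

Note that the trend in the correlations is the "signal" numerator; everything then hinges on what is used for the "noise" denominator.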

The authors then construct a “noise” pattern to serve as the “nature-only” counterfactual to the above diagram. They start by selecting 200-year control runs from 36 models and gridding them in the same 10×10 format. Eventually they will average them all up, but first they detrend each gridcell in each model, which I consider a misguided step.

Everything depends on how valid the natural variability comparator is. We are given no explanation of why the authors believe it is a credible analogue to the natural temperature patterns associated with post-1979 non-anthropogenic forcings. It almost certainly isn’t. The sum of the post-1979 volcanic+solar series in the IPCC AR5 forcing series looks like this:

FIGURE 3: Sum of the post-1979 volcanic + solar forcing series (IPCC AR5)

This clearly implies natural forcings would have induced a net warming over the sample interval, and since tropical amplification occurs regardless of the type of forcing, a proper “nature-only” spatial pattern would likely look a lot like Figure 2. But by detrending every gridcell Santer et al. removed such patterns, artificially worsening the estimated post-1979 natural comparator.

The authors’ conclusions depend critically on the assumption that their “natural” model variability estimate is a plausible representation of what 1979-2018 would have looked like without greenhouse gases. The authors note the importance of this assumption in their Supplement (p. 10):

“Our assumption regarding the adequacy of model variability estimates is critical. Observed temperature records are simultaneously influenced by both internal variability and multiple external forcings. We do not observe “pure” internal variability, so there will always be some irreducible uncertainty in partitioning observed temperature records into internally generated and externally forced components. All model-versus-observed variability comparisons are affected by this uncertainty, particularly on less well-observed multi-decadal timescales.”

As they say, every fingerprint and signal-detection study hinges on the quality of the “nature-only” comparator. Unfortunately by detrending their control runs gridcell-by-gridcell they have pretty much ensured that the natural variability pattern is artificially degraded as a comparator.

It is as if a bank robber were known to be a 6 foot tall male, and the police put their preferred suspect in a lineup with a bunch of short women. You might get a confident witness identification, but you wouldn’t know if it’s valid.

Making matters worse, the greenhouse-influenced warming pattern comes from models that have been tuned to match key aspects of the observed warming trends of the 20th century. While less of an issue in the MT layer than would be the case at the surface, there will nonetheless be partial enhancement of the match between model simulations and observations due to post hoc tuning. In effect, the police are making their preferred suspect wear the same black pants and shirt as the bank robber, while the short women are all in red dresses.

Thus, it seems to me that the lines in Figure 1 are based on comparing an artificially exaggerated resemblance between observations and tuned models versus an artificially worsened counterfactual. This is not a gold standard of proof.

(d) t-statistics and p values

The probabilities associated with the sigma lines in Figure 1 are based on the standard Normal tables. People are so accustomed to the Gaussian (Normal) critical values that they sometimes forget these are only valid for t-type statistics under certain assumptions, which need to be tested. I could find no indication in the Santer et al. paper that such tests were undertaken.

I will present a simple example of a signal detection model to illustrate how t-statistics and Gaussian critical values can be very misleading when misused. I will use a data set consisting of annual values of weather-balloon measured global MT temperatures averaged over RICH, RAOBCORE and RATPAC, the El Niño-Southern Oscillation Index (ESOI – pressure-based version), and the IPCC forcing values for greenhouse gases (“ghg”, comprising CO2 and other), tropical ozone (“o3”), aerosols (“aero”), land use change (“land”), total solar irradiance (“tsi”) and volcanic aerosols (“volc”). The data run from 1958 to 2017 but I only use the post-1979 portion to match the Santer paper. The forcings are from IPCC AR5 with some adjustments by Nic Lewis to bring them up to date.

A simple way of investigating causal patterns in time series data is to use an autoregression. Simply regress the variable you are interested in on itself lagged once, plus lagged values of the possible explanatory variables. Inclusion of the lagged dependent variable controls for momentum effects, while the use of lagged explanatory variables constrains the correlations to a single direction: today’s changes in the dependent variable cannot cause changes in yesterday’s values of the explanatory variables. This is useful for identifying what econometricians call Granger causality: when knowing today’s value of one variable significantly reduces the error in forecasting the future values of another variable.
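
As a sketch of what such an autoregression looks like in practice, the following fits, by ordinary least squares, a regression of a temperature series on its own lag plus lagged forcings. All the series here are synthetic stand-ins (not the balloon data), so the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-ins for the series (illustrative only, NOT the
# balloon data): a trending anthropogenic forcing, a stationary
# natural forcing, and a temperature that responds to both.
n = 40
anthro = np.cumsum(0.03 + 0.01 * rng.normal(size=n))
natural = rng.normal(size=n)
temp = np.zeros(n)
for t in range(1, n):
    temp[t] = (0.5 * temp[t - 1] + 0.8 * anthro[t - 1]
               + 0.2 * natural[t - 1] + 0.1 * rng.normal())

# Autoregression: Temp on its own lag plus lagged forcings.
y = temp[1:]
X = np.column_stack([np.ones(n - 1), temp[:-1], anthro[:-1], natural[:-1]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# t-ratios: coefficient divided by its standard error.
resid = y - X @ beta
s2 = resid @ resid / (len(y) - X.shape[1])
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
tstats = beta / se
print(np.round(tstats, 1))   # order: const, l.Temp, l.anthro, l.natural
```

The t-ratios computed this way are only valid against t-tables if the regressors satisfy the stationarity assumptions discussed below.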

My temperature measure (“Temp”) is the average MT temperature anomaly in the weather balloon records. I add up the forcings into “anthro” (ghg + o3 + aero + land) and “natural” (tsi + volc + ESOI).

I ran the regression Temp = a1 + a2*l.Temp + a3*l.anthro + a4*l.natural, where a lagged value is denoted by an “l.” prefix. The results over the whole sample length are:

The coefficient on “anthro” is more than twice as large as that on “natural” and has a larger t-statistic. Also its p-value indicates that the probability of obtaining such a value, if there were no effect, is 1 in 2.4 billion. So I could conclude based on this regression that anthropogenic forcing is the dominant effect on temperatures in the observed record.

The t-statistic on anthro provides a measure much like what the Santer et al. paper shows. It represents the marginal improvement in model fit based on adding anthropogenic forcing to the time series model, relative to a null hypothesis in which temperatures are affected only by natural forcings and internal dynamics. Running the model iteratively while allowing the end date to increase from 1988 to 2017 yields the results shown below in blue (Line #1):

FIGURE 4: S/N ratios for anthropogenic signal in temperature model

It looks remarkably like Figure 1 from Santer et al., with the blue line crossing the 3-sigma level in the late 1990s and hitting about 8 sigma at the peak.
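
The expanding-endpoint exercise can be sketched as follows, again on synthetic stand-in series (hypothetical, not the balloon data): fit the same lagged regression to longer and longer samples and record the t-ratio on lagged anthro each time.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic stand-ins (illustrative only, not the balloon data):
# a trending "anthro" forcing driving temperature, plus noise.
n = 40
anthro = np.linspace(0.0, 1.0, n)
natural = 0.1 * rng.normal(size=n)
temp = 0.8 * anthro + natural + 0.05 * rng.normal(size=n)

def t_on_anthro(end):
    """t-ratio on lagged anthro when temp is regressed on a constant,
    lagged temp, lagged anthro and lagged natural, using the first
    `end` observations."""
    y = temp[1:end]
    X = np.column_stack([np.ones(end - 1), temp[:end - 1],
                         anthro[:end - 1], natural[:end - 1]])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[2, 2])
    return beta[2] / se

# Expanding the end date mimics the growing "signal" lines.
tstats = [t_on_anthro(end) for end in range(10, n + 1)]
print(len(tstats), round(tstats[-1], 1))
```

With a trending forcing in the data, the t-ratio on anthro grows with the sample length, which is what gives Figure 4 (and, arguably, Figure 1) its rising shape.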

But there is a problem. This would not be publishable in an econometrics journal because, among many other things, I haven’t tested for unit roots. I won’t go into detail about what they are; I’ll just point out that if time series data have unit roots they are nonstationary and you can’t use them in an autoregression, because the t-statistics follow a nonstandard distribution and Gaussian (or even Student’s t) tables will give seriously biased probability values.

I ran Phillips-Perron unit root tests and found that anthro is nonstationary, while Temp and natural are stationary. This problem has already been discussed and grappled with in some econometrics papers (see for instance here and the discussions accompanying it, including here).
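
To see why the distinction matters, here is a sketch of a simple Dickey-Fuller regression, a close cousin of the Phillips-Perron test used above (numpy only, synthetic data; my own illustration). The t-ratio on the lagged level has a 5% critical value near -2.9 under the unit root null, far beyond the Gaussian -1.65.

```python
import numpy as np

rng = np.random.default_rng(3)

def df_stat(x):
    """t-ratio on the lagged level in the Dickey-Fuller regression
    dx_t = c + rho * x_{t-1} + e_t. Under a unit root this ratio
    does NOT follow a t distribution: its 5% critical value is
    about -2.9, well beyond the Gaussian -1.65."""
    dx = np.diff(x)
    X = np.column_stack([np.ones(len(dx)), x[:-1]])
    beta, *_ = np.linalg.lstsq(X, dx, rcond=None)
    resid = dx - X @ beta
    s2 = resid @ resid / (len(dx) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

walk = np.cumsum(rng.normal(size=200))   # nonstationary: unit root
noise = rng.normal(size=200)             # stationary series

# The stationary series rejects a unit root decisively; the random
# walk typically does not.
print(round(df_stat(walk), 1), round(df_stat(noise), 1))
```

A series that fails to reject here, like the random walk, should not be used in levels in the autoregression.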

A possible remedy is to construct the model in first differences. If you write out the regression equation at time t and also at time (t-1) and subtract the two, you get d.Temp = a2*l.d.Temp + a3*l.d.anthro + a4*l.d.natural, where the “d.” means first difference and “l.d.” means lagged first difference. First differencing removes the unit root in anthro (almost – probably close enough for this example) so the regression model is now properly specified and the t-statistics can be checked against conventional t-tables. The results over the whole sample are:

The coefficient magnitudes remain comparable but—oh dear—the t-statistic on anthro has collapsed from 8.56 to 1.32, while those on natural and lagged temperature are now larger. The problem is that the t-ratio on anthro in the first regression was not a t-statistic, instead it followed a nonstandard distribution with much larger critical values. When compared against t tables it gave the wrong significance score for the anthropogenic influence. The t-ratio in the revised model is more likely to be properly specified, so using t tables is appropriate.
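
The collapse of an apparently huge t-ratio is exactly what the spurious-regression literature predicts. A small Monte Carlo (my own sketch, numpy only) makes the point with two independent random walks:

```python
import numpy as np

rng = np.random.default_rng(4)

def abs_t_on_x(y, x):
    """|t-ratio| on x in the regression y = a + b*x + e."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return abs(beta[1] / se)

# Regress one random walk on an INDEPENDENT random walk. In levels
# the t-ratio is routinely "significant" (the classic spurious
# regression); in first differences it behaves like a real t-statistic.
n, reps = 100, 200
t_levels, t_diffs = [], []
for _ in range(reps):
    y = np.cumsum(rng.normal(size=n))
    x = np.cumsum(rng.normal(size=n))
    t_levels.append(abs_t_on_x(y, x))
    t_diffs.append(abs_t_on_x(np.diff(y), np.diff(x)))

print(np.mean(t_levels) > 2, np.mean(t_diffs) > 2)   # prints: True False
```

Even though the two walks are unrelated by construction, the levels regression routinely "detects" a relationship at conventional significance levels; differencing removes the illusion.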

The corresponding graph of t-statistics on anthro from the second model over varying sample lengths is shown in Figure 4 as the green line (Line #2) at the bottom of the graph. Signal detection clearly fails.

What this illustrates is that we don’t actually know what the correct probability values are to attach to the sigma values in Figure 1. If Santer et al. want to use Gaussian probabilities they need to test that their regression models are specified correctly for doing so. But none of the usual specification tests were provided in the paper, and since it’s easy to generate a vivid counterexample we can’t take the Gaussian assumption for granted.


The fact that in my example the t-statistic on anthro falls to a low level does not “prove” that anthropogenic forcing has no effect on tropospheric temperatures. It does show that in the framework of my model the effects are not statistically significant. If you think the model is correctly specified and the data set is appropriate you will have reason to accept the result, at least provisionally. If you have reason to doubt the correctness of the specification then you are not obliged to accept the result.

This is the nature of evidence from statistical modeling: it is contingent on the specification and assumptions. In my view the second regression is a more valid specification than the first one, so faced with a choice between the two, the second set of results is more valid. But there may be other, more valid specifications that yield different results.

In the same way, since I have reason to doubt the validity of the Santer et al. model I don’t accept their conclusions. They haven’t shown what they say they showed. In particular they have not identified a unique anthropogenic fingerprint, or provided a credible control for natural variability over the sample period. Nor have they justified the use of Gaussian p-values. Their claim to have attained a “gold standard” of proof is unwarranted, in part because statistical modeling can never do that, and in part because of the specific problems in their model.


139 thoughts on “Critique of the new Santer et al. (2019) paper”

  1. A good example of getting down in the weeds on a statistical model’s claims. I don’t really know statistics well enough, but the explanation of just how arbitrary the “natural causes” comparison is illuminates the sort of problems models have.

    • The whole concept is bogus. Models are TUNED to fit the climate record by tweaking innumerable poorly constrained parameters. This gives modellers quite a lot of leeway to choose high sensitivity configurations by convenient parameter choices.

      Hansen et al 2002 states quite openly that “models still can be made to yield a wide range of sensitivities by altering model parameterizations”. Indeed that is what they did when they abandoned earlier physics based work on volcanic forcing in favour of arbitrary tweaking in order to reconcile model output with the climate record.

      This means that their “natural forcing” runs are equally convenient mythologies, based more on their own biases and expectations than any scientific reality.

      If that is their “control” for these calculations the result is simply one of induction reflecting modelers’ choices. A strong AGW was built into the models largely by design. This paper simply affirms that is what models do. Bringing the satellite data into it is simply a red scarf trick.

      The Hansen paper and tropical climate sensitivity was discussed in detail here:

      • “Models are TUNED to fit the climate record by tweaking innumerable poorly constrained parameters”

        Also by tweaking the climate record.

    • As Jamal Munshi has eloquently demonstrated numerous times, cross plotting two or more time-varying parameters against each other can lead to spurious correlations that have no basis in reality. One of Munshi’s “classics” is a statistical analysis showing that the increase in global atmospheric temperature can be rather strongly correlated to an increase in UFO reports.

      In other words, statistics can be (mis)used to show correlation that does not translate to causation. One must always consider phase lead/lag between any two correlated time-varying parameters to establish causation, independent of the p-level or sigma-level in the analysis.

      Dr. Curry is aware of this, as evidenced in her paragraphs discussing “investigating causal patterns in time series data is using an autoregression” and her mention of “Granger causality.”

      With their emphasis on just the S/N ratio improvement of their “fingerprint” correlations, there is little evidence that Santer et al. were aware of this issue.

      • FYI – Ross McKitrick is the author of the critique of the Santer paper, not Curry

        • Sorry, you are correct . . . my bad. Dr. McKitrick wrote the above article, which was re-published from Dr. Curry’s website.

          Thank you for the correction, and kudos to Dr. McKitrick for his critical analysis of the Santer et al. paper.

    • You know it is b/s when the poles haven’t been taken into consideration. Without Arctic warming there is not enough warming to exceed the error bands on the measurements.

  2. If there’s a dangerous anthropogenic signal after over half a doubling of the effects of CO2 (logarithmically-speaking), why does Santer even need to do this? Why doesn’t he just point at it?

    • “If your experiment needs statistics, you ought to have done a better experiment.”
      Sir Ernest Rutherford

    • Oh yes, the de rigueur “another nail…” statement as if it were the gospel truth.
      One of these days someone should publish a “Removing the nails from…” paper which attempts a meta review of faulty climate claims/papers.
      It could also be a Blog.

      Oh wait!

  3. I will discuss four aspects of this study which I think weaken the conclusions considerably:

    …all you need is one……..Ben Santer

  4. “…. they can detect the anthropogenic influence in the mid-troposphere at a 5-sigma level of confidence.”
    As one of the older WUWT erudite regulars (G. E. Smith, sadly I have not seen for some time) said ‘statistics is just a numerical origami’, or to coin a phrase ‘statistical confirmation is like beeswax, smells nice but don’t expose it to strong sunlight’.

  5. 😎
    Anybody else find it odd that the claims of “settled science” from a few decades ago keep needing another shake or two to make it really “settled science”?
    What do they say on cereal boxes? “Sold by weight, not by volume”.
    As the years go by it becomes clearer and clearer that climate “science” carries less and less scientific “weight” so they keep increasing the volume.

    • Science is only settled to the extent that you have practically applied your theories about a system to that system, and the application then works. It’s not “settled” to any further extent than that. Being able to accurately tell people where and when they need to go to see a total solar eclipse, and how long it lasts, demonstrates a level of understanding of how gravity works. Being able to build an atomic bomb demonstrates a level of understanding over nuclear physics. Being able to build a power transmission grid demonstrates a level of understanding of electromagnetics. The fact that we can’t accurately predict the behavior of upcoming solar cycles tells us that something about gravity, or nuclear physics, or electromagnetics still needs to be ironed out or “settled.”

      The corollary is that if you’ve never done anything with your purported knowledge, except write and speak that you “know” it, it’s not really “settled” at all. The manner in which you apply your “understanding” is the dividing line between what you really know about a system and what you only think you know about it. So in a very literal sense, we can say that climate scientists know nothing of what they think they know, because they have never been able to reliably put their theories about climate to any demonstrably useful result.

      All they offer are cheap, useless words.

      • Science is never settled, but the corollary to a useful scientific hypothesis is that future predictions you make are not refuted by experimental data. The more times your hypothesis makes predictions confirmed by experimental data, the stronger it is. Equally the more attempts to falsify the hypothesis are unsuccessful, the greater the likelihood of it being true.

        Many of the climate communities’ computer models have already been debunked as inaccurate through failing to predict future climate evolution correctly.

        The model simulations I would most like to see compare two scenarios:

        1. Status quo consumption of hydrocarbons and status quo destruction of rainforests.
        2. Status quo consumption of hydrocarbons and dedicated reafforestation, greening of savannah and deserts, more effective hydrology management and dedicated greening of urban environments.

        That way, you can separate hydrocarbon consumption from environmental degradation, since the two are entirely unrelated activities, with some overlap where extraction technologies involve land damage.

        This is I think the analysis the majority of ordinary folks want to see, since they would like to continue using cars whilst seeing sensible environmental commitments being promoted with alacrity.

        • “Science is never settled.”

          The acceleration of gravity at the Earth’s surface is approximately 9.8 meters per second squared. That’s settled. That Faraday’s Law of Induction accurately describes the relationship between magnetic and electric fields is settled. My point is that, until you actually apply what you think you know about a system to do something that works, you don’t really know what you think you know. You just think you know it. And when what you try to do does work, those applications are the only actual measurement of the extent or bounds of your understanding.

          Let’s put it this way – if an assertion that something is scientifically settled requires persuasion or argument to show that it’s settled, it’s not settled. Most people know nothing about Faraday’s Law of Induction, but they can walk over to their wall, flip a switch and the lights come on. So whatever the science is behind power generation, it has to be right (or settled), and even those who know nothing about the details won’t ever question it. Absent some kind of objective scientific accomplishment in the field of climate science, nothing about it can ever be said to be settled. You can make any kind of models you want, but until those models start telling us in advance what then actually comes to pass, the models are useless.

          • >>
            Most people know nothing about Faraday’s Law of Induction, but they can walk over to their wall, flip a switch and the lights come on. So whatever the science is behind power generation, it has to be right (or settled), and even those who know nothing about the details won’t ever question it.

            Faraday’s law is included in Maxwell’s equations. Maxwell’s equations are only approximately true. QED is more correct. So, Faraday’s law isn’t really settled science–is it?


            “Faraday’s law is included in Maxwell’s equations. Maxwell’s equations are only approximately true . . . So, Faraday’s law isn’t really settled science–is it?”

            As I said, applications of scientific understanding define the extent to which your scientific understanding of a subject is settled. That there may be more to learn does not refute that what has been actually demonstrated is known.

            When a climate scientist says that “the science is settled” they aren’t trying to argue that there is nothing further to learn – they are just trying to stake out a perimeter around something that they don’t want questioned and then moving on. I think you are looking at the term “settled” in the first of these two senses, whereas I am attacking the second.

            With respect to Maxwell’s generalized set of equations that together try to encapsulate the complete relationship between electric/magnetic fields and fluxes, the fact that the set of equations may lead to incorrect results in certain limited circumstances, due to quantum effects for example, doesn’t necessarily mean that Faraday’s law in particular is just an approximation. That Ohm’s law may break down at very high voltages or currents does not give me cause to question the resistances of materials listed in the Handbook of Chemistry and Physics for voltages and currents within the ranges over which those two variables are linear for a material.

  6. This paper demonstrates that modern software has made statistical analysis available to the masses but not many are expert in their application and significance.

    It would be interesting to see comparative statistical work for postulated climate drivers such as solar activity and cosmic rays.

    • Most modern statistical packages include a test of whether or not your data can be considered Gaussian or not. With the caveat that the results cannot be used if the data are not sufficiently Gaussian. I suspect Santer wrote his own custom code …

  7. Not to forget that weather balloon data doesn’t show any clear indication of this effect

  8. To my mind, Santer’s paper has only one purpose. It allows non-experts to claim their climate alarm is supported (in the peer reviewed literature) to the standards expected of quantum physics. The baton is being passed from the claims of 97% consensus.

    The fact that Santer’s paper is so obscure that non-experts will find it impenetrable is just fine. That just makes it impossible to debate among non-experts.

    Ross, your analysis is very helpful. But non-experts need the simplest possible refutation you can muster.

    • Natural + Anthro = all warming

      fudge factors + fiddlers constant + CO2 = Anthro warming

      natural + (Fudge factors + fiddlers constant + CO2) = all warming

      Therefore use whatever statistics we can to show that, because we defined Anthro as large and natural as small, we can now see a signal to 5 Sigma of Anthro.

      Swap Natural to large in the first equation and Anthro to small, and the same analysis will find a high natural signal to 5 Sigma


      • Precisely, the biggest problem is that they assumed what they claimed to show. Let me define the natural components and you will get a completely different result.

  9. Varotsos and Efstathiou, 2019

    “An identifiable signature of [anthropogenic] global warming is the combination of tropospheric heating and stratospheric cooling leading to an increase in the height of tropopause. However, according to Figure 1 this is not the case, because the TA [tropopause air] trend at the tropopause is near zero.”

    “In summary, the tropospheric temperature has not increased over the last four decades, in both hemispheres, in a way that is more amplified at high latitudes near the surface. In addition, the lower stratospheric temperature did not decline as a function of latitude. Finally, the intrinsic properties of the tropospheric temperature are different from those of the lower stratosphere. Based on these results and bearing in mind that the climate system is complicated and complex with the existing uncertainties in the climate predictions, it is not possible to reliably support the view of the presence of global warming in the sense of an enhanced greenhouse effect due to human activities.”

  10. You really have to wonder if Santer et al are actually unaware of the statistical pitfalls of their analysis as Dr Curry pointed out, that is, if they really are just incompetent, or if they are quite familiar with the errors that can arise from abusing the statistical analysis and are knowingly constructing a false narrative.

  11. How could anyone detect anthropogenic warming and distinguish it from natural variability when no one knows what the natural variability is?

    I suspect Santer’s paper is meaningless pseudo-scientific physico-babble.

    • Dubious to reply to myself, but, Lubos Motl, sophisticated physicist, said the exact same thing I did. “Since natural variability is not known, how can it be compared to AGW? Cannot.”

  12. Any model that is linear and relates CO2 to Temperature is bound to fail. Temp =/= f(CO2), Temp = f(Log(CO2)), in reality the model shouldn’t even be Temp = f(CO2), it should be Temp = f(W/M^2). There are plenty of things that alter the W/M^2 other than CO2, namely water vapor, clouds, cosmic rays and solar radiation. A focus on CO2 simply allows the climate alarmists to claim victory. They are getting everyone to chase their tails.
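
The logarithmic dependence this comment describes is easy to illustrate with the standard simplified CO2 forcing expression from Myhre et al. (1998), a quick sketch:

```python
import math

# Standard simplified CO2 forcing expression (Myhre et al. 1998):
# dF = 5.35 * ln(C / C0) watts per square meter.
def co2_forcing(c, c0=280.0):
    return 5.35 * math.log(c / c0)

# Each doubling contributes the same ~3.7 W/m^2, which is the
# logarithmic (not linear) dependence described above.
print(round(co2_forcing(560) - co2_forcing(280), 2),
      round(co2_forcing(1120) - co2_forcing(560), 2))   # prints: 3.71 3.71
```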

    Hockeystick Con Job; CO2 Can’t Cause Temperature Dog-Legs

  13. How is their claim actually just a rather elaborate circular logic circuit? They claim they can detect the human forcing by, in effect, finding everything that is not a natural forcing. But if they know the natural forcing, obviously everything left is human. They are still where they have always been, proving they know what the natural forcing comes to. They always seem to gloss over that piece, probably because they don’t want to admit they have no way to prove what is natural.

  14. Elaborate statistics aren’t necessary to see that the claim of Santer et al, glibly embraced by Nature, is nonsense. The warmer temperature that has existed for the last 2 decades in the satellite record resulted from a brief interval of warming before the 1996 El Nino – just a couple of years. [16:00]

    Such brief changes have been recognized previously.

    Salby also showed that (A) only two such brief intervals, each preceding El Ninos, account for nearly all of the increased temperature over the 20th century and (B) that the resulting warmer temperature is random, unconnected to the systematic increases of CO2.

    So much for Nature.

  15. Thank you Ross. Once again we have a paper undone by implicit assumptions. I have the nasty suspicion that most of the people who invoke statistics don’t know, much less understand, the assumptions they’re making.

  16. As someone who followed particle physics for many years, I find the Santer et al. attempt to imply their "climate science" is as rigorous as the physics done at CERN, Fermilab and other real science labs an attempt to manufacture undeserved credibility. I do have statistical training, and find Mr. McKitrick's critique valid and convincing.

  17. The “police lineup” analogy pretty much tells the tale. Thanks for the hard work!


  18. Global Climate Change is a ruse to keep your mind off the #1 Problem on Planet Earth: Islam.

      • My comment was in reply to David Dirkse, whose posts have been removed. It was not in reply to Stanny1.

      • There is less global poverty at present than any other time in history. Almost all of the present poverty is due to bad governments, not lack of food growing capacity or natural resources. The highest poverty is in countries with socialist type economies (even though the type of government may be so called democratic, it is the type of economy that is important).

        • You probably mean "Less religion and less bad politics would result in less poverty", or?

          • I said what I meant and I meant what I said.
            Islam is pretty much the only religion that results in poverty. Your belief that all religion is bad just demonstrates your own biases.
            Bad politics can hurt, however socialism always reduces the total wealth of a country.

      • Sorry to be brutally honest, but it is only the number-one problem to those in poverty; for most in the developed world it is a problem only when they think about it in a guilt moment. If you ran a survey in a developed country, I doubt it would rate as the number-one concern.

    • I, and other free thinking humans can only hope you become a statistic of this ideology. Your idiotic statement is thematic of the leftist contagion and says a lot about your beliefs on global warming and inability to interpret reality.

  19. I thought the journals had decided to include a review by professional statisticians before publication.

    • “Professional Statisticians” is a term like “Climate Scientists”, I am immediately wary of anyone who claims to be either, since there is no examination, Hippocratic oath, or risk of being struck-off.

      In fact there is reverse risk of being struck off, i.e. shunned for not being sufficiently alarmist.

  20. This post goes down a trail in which I’ve become very interested lately: that of correct error propagation and significant digits. Since Mr. McKitrick is a professional statistician, I would love to hear some commentary regarding how the subject is handled in the temperature calculations we see every month.

    Everything I’ve read on the subject online at various university science departments indicates to me that these fundamentals are being ignored at every level. For example, starting with temperatures measured in tenths of degrees being resolved into four-and-five decimal points of precision. We’ve had many discussions here about the Law of Large Numbers and lowering the error in the mean by its use. However, that ability lies in having many thousands of measurements for N, and I don’t see where they can come from.

    Take the daily TMIN and TMAX from a station, and it's averaged to get TAVG. If you start with 35.3°C and 17.6°C, TAVG should be 26.4, and not 26.45, because the rules I've seen everywhere say you can't have more precision in the result than you had in the least precise measurement. That is violated constantly in the results of many papers and reports, and it makes me wonder if the authors are uncaring, unaware, or if there are other methods.

    For example, I calculated a 30-year baseline series for Sky Harbor in Phoenix, and the January average I calculated to be 13.6 +/- 0.1°C. The Jan. average for the year 2015 was 14.8 +/- 0.6°C. The anomaly for that month was 1.2 +/- 0.6°C, because every rule I'd seen said that when adding or subtracting two values with uncertainties, the resulting uncertainty was √(Δx² + Δy²).

    That gives me the 0.6 uncertainty, but is it correct?
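For what it's worth, the quadrature rule quoted above can be checked directly. Assuming independent errors, the commenter's 0.6 figure follows (a minimal sketch, with the values taken from the comment):

```python
import math

def combined_uncertainty(dx, dy):
    """Uncertainty of x + y or x - y for independent errors:
    sqrt(dx^2 + dy^2) (add in quadrature)."""
    return math.sqrt(dx ** 2 + dy ** 2)

# Baseline 13.6 +/- 0.1 C, monthly mean 14.8 +/- 0.6 C (from the comment)
anomaly = 14.8 - 13.6
sigma = combined_uncertainty(0.1, 0.6)
print(f"{anomaly:.1f} +/- {sigma:.1f} C")  # 1.2 +/- 0.6 C
```

The quadrature sum is dominated by the larger term (√(0.01 + 0.36) ≈ 0.61), so yes, under the independence assumption 0.6 is the right figure at one decimal place.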

    • Ah, my poor memory does me in again. Thanks for the correction, upper-case though it was.

      I’d still be interested in hearing his thoughts on the subject.

    • “McKitrick is an economist.”

      Can you possibly name a field that is more statistics intensive than economics? The multi-variable statistical techniques used in climate models were developed for the field of economics. The program that makes all this possible is SPSS: the Statistical Package for the Social Sciences.

      • I’m not gonna argue the point; if he’s not a statistician, he’s not a statistician. He does have demonstrated ability in the realm, and probably knows more about the topic than any one of us. Don’t let the perfect be the enemy of the good.

      • “Theories of economics are based on building logical deductive systems based on given sets of axioms. Very little statistics involved.”

        You just proved to everyone that you have absolutely no idea what you are talking about. Economics was the foundation of the statistics that support the multivariable linear regression modeling used by Wall Street and climate modelers. Economics is by far one of, if not the most, statistically intensive fields out there. They attempt to model entire economies and markets. As I said, SPSS was developed to solve the problems faced in the field of economics, where you have to “control” using statistical techniques instead of test tubes and laboratories.

        BTW, even the best can’t model the Stock Market 1 week in the future, and Climate Alarmists claim they can use the same techniques to model the Climate 100 years out. That alone proves they live in a total fantasy world.

      • That is not quite what economists do in practice, unless one is discussing pure theory and the teaching thereof. In practice, it is more econometrics: devising and running models of what some part or all of the economy has done or will do in the real world.
        There is an obvious overlap in skill sets between climate science in the modeling sense and economics.

      • We find another field in which David actually believes himself to be knowledgeable, yet he is clearly quite ignorant.

    • McKitrick is an economist but his knowledge is well above that of most “climate scientists”. Similar for Steve McIntyre.

    • Bottom line @PG, they (Past Interglacials Working Group of PAGES) feel it will be 50,000 years before this interglacial period ends due to the CO2 levels… and it took 218 pages to say that.

      • I did not see that statement. I also contend that they did not provide convincing evidence that current CO2 levels, using current measurements, would have the ability to put off the next stadial. It was more convincing that orbital mechanics were not set up yet for the next stadial and could take 10s of thousands of years before they are set up to initiate a stadial.

        • Page 207 8.2 Next Glacial Inception: These results show consistently, that a glacial inception is unlikely to happen within the next approximate 50 ka (when the next strong drop in Northern Hemisphere summer insolation occurs) if either atmospheric CO2 concentration remains above 300ppm or cumulative carbon emissions exceed 1000 Pg C [Loutre and Berger, 2000; Archer and Ganopolski, 2005; Cochelin et al., 2006]. Only for an atmospheric CO2 content below the preindustrial level may a glaciation occur within the next 10 ka [Loutre and Berger, 2000; Cochelin et al., 2006; Kutzbach et al., 2011; Vettoretti and Peltier, 2011; Tzedakis et al., 2012a].

          Reads like a group hug

          • Thanks. I still don’t see the issue. They are basing their speculation on the correlation of stadial periods to orbital mechanics and CO2 levels from the past. It certainly stands to reason that as long as orbital mechanics driven insolation keeps us at the interstadial peak, CO2 levels will continue to be beneficial and high as the Earth remains green or greener.

            And to be clear, I doubt that what little amount of anthropogenic CO2 we are producing will have any effect one way or another.

          • State of Fear, Page 563 Kenner: The nasty little apes that call themselves human beings can do nothing but run and hide. For these same apes to imagine they can stabilize the atmosphere is arrogant beyond belief. They can’t control the climate.

            They know that, it is not about controlling climate, it is about control of people.

      • If that were true, what a fantastic stroke of luck it would be for all of humanity and, in fact, all life forms.

        Unfortunately, we now pretty much know the activity, or lack thereof, of the wimp that is CO2 at current levels.

        • Very well said. With a cautionary note to not mix CO2 measurements in ice cores with direct CO2 measurements in the atmosphere, higher CO2 levels in ice cores were not able to prevent powerful and deadly stadial periods. The tiny anthropogenic fraction of a small fraction of total atmospheric CO2 ppm doesn’t have a chance in hell of preventing that next jagged fall to killing cold, which is estimated to be about 50,000 years away give or take when orbital mechanics will set us up for that jagged fall. The article I linked to, if anything, reiterates that Mother Nature remains in charge and will eventually shake off a great deal of flora and fauna once again.

  21. So Santer et al. think they have demonstrated an anthropogenic signal in the satellite data –so what.
    I think most climate rationalists would agree that there is probably some anthropogenic effect, but as Ross McKitrick says, they haven’t shown the magnitude of that effect, regardless of the flaws in their analysis.
    The paper may also be timed to take advantage of the effect of the 2016 El Nino in the data; it’s a ‘straw man’ paper to capture publicity.

  22. A proper review of the paper would recommend that the estimated natural forcing be varied and the test repeated to see what the effect was. I suspect that it will be large, and will invalidate the results. If Santer has shared the code, this could be done fairly easily. He could do some science and run this test as well. Since it all depends (and he admits it) on how you slice the natural and the anthropogenic contributions, this must be done to verify his results. Just one more thought: we don’t know what caused the natural warming. The world warmed significantly during the early 20th century, but we don’t know why. CO2 was fairly constant and not responsible. As far as we can tell, the sun’s output has been too steady to cause it. Therefore, it is a bit premature to perform this kind of analysis when we don’t even understand what drives natural warming or cooling.

  23. Anyone who publishes a paper based on the RCP8.5 scenario is no longer a scientist but a Kool-Aid drinker in the church of global warming.

  24. 1. The models cannot reproduce observations.
    2. This paper relies on models to produce data for which we have no observations (the temperature of earth as it would have been without human co2 emissions)

    This is just circular reasoning. The model output doesn’t match the observational data we do have, so using those same flawed models to stand in for observational data we don’t have is absurd.

    Before the 97% consensus mantra turned into the gold standard mantra, they used to simply run the models with and without anthropogenic forcing to calculate the magnitude of the “signal”. How is this different?

    • Santer et al. essentially compares:
      MODELED natural variability N (which the models get wrong) with
      MODELED human impact H (using the same flawed models, which also get that part wrong).
      The observations (O) are used as a yardstick, but are not needed! Keep your eye on the ball. It is a sleight-of-hand trick.
      They check O versus N and O versus (N+H), which amounts to comparing N to N+H. That means that O is not even needed.

      And both N and H are just MODELED and they (a priori) MODEL the H part to be significant when compared to N, hence N+H will be significantly different from N as well.

      What matters is that the assumption to their argument is that they modeled the H part right by making it significant when compared to N, and then BASED ON THAT conclude that N+H differs significantly from N. Using their assumptions that conclusion would always follow. Circular reasoning indeed.
      Their conclusion solely rests on their assumptions, regardless of bad statistics.
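The circularity argument above can be caricatured in code. This is a toy simulation, not Santer et al.'s actual method: if the anthropogenic component H is modeled as large relative to the modeled natural noise N, then N+H "significantly" differs from N by construction, with no observations entering anywhere.

```python
import random
import statistics

random.seed(0)
n_years = 40

# Toy stand-ins (assumptions for illustration, not the paper's series):
# "natural-only" runs are white noise; the anthropogenic component H is
# modeled as a trend that is large relative to that noise, i.e. it is
# significant by construction.
natural = [random.gauss(0.0, 0.15) for _ in range(n_years)]
forced = [random.gauss(0.0, 0.15) + 0.03 * t for t in range(n_years)]

# Compare N with N+H directly -- no observations are involved.
mean_diff = statistics.mean(forced) - statistics.mean(natural)
se = ((statistics.variance(forced) + statistics.variance(natural)) / n_years) ** 0.5
t_like = mean_diff / se

# The "detection" clears the 5-sigma threshold because H was assumed
# significant up front.
print(t_like > 5)
```

Whatever series plays the role of observations, the N versus N+H contrast is decided by the modeling assumption, which is the commenter's point.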

  25. “CO2islife, when and if you can get back to me on the quantifiable effect of clouds, we can talk. As it stands today, you cannot assert the effect being net positive or net negative on W/M^2. Since cosmic rays have no effect on W/M^2 in a cloudless environment, you cannot assert that they (cosmic rays) have ANY effect on W/M^2.”

    1) You State that “Clouds cool during the day, and insulate during the night.”
    2) Clouds alter W/M^2
    3) Cosmic Rays trigger cloud formation
    4) I never said Cosmic Rays hitting the desert does anything

    What part of that simple logic don’t you understand?

    • “What part of simple arithmetic do you not understand Mr. CO2isLife?”

      Are you for real?

      “If cosmic rays affect cloud formation, then at night these clouds insulate.”

      Yep, that would alter W/M^2

      “If cosmic rays affect cloud formation, then during the day these clouds cool.”

      Yep, that would alter W/M^2

      “So, please tell us what is the arithmetical sum of insulation at night plus cooling during the day?”

      That assumes the clouds last at least 24 hours and remain in the same location and altitude and density. They don’t.

      “Is it net zero, net insulate or net cool?”

      That assumes the warming and cooling effect are identical, they aren’t.

      Strike 1, Strike 2, Strike 3, you’re out. Your critical thinking and reasoning skills are truly lacking, and lead me to believe you must have majored in Elementary Art or PhysEd.

    • “What part of simple arithmetic do you not understand Mr. CO2isLife?”

      None of it.

      “If cosmic rays affect cloud formation, then at night these clouds insulate.
      If cosmic rays affect cloud formation, then during the day these clouds cool.”

      Both alter W/M^2, which is my point.

      “So, please tell us what is the arithmetical sum of insulation at night plus cooling during the day?”

      You assume the clouds last at least 24 hours, in the same location, at the same altitude and same density and transparency. They don’t. You assume way too much.

      “Is it net zero, net insulate or net cool?”

      You assume the warming which is due to trapping heat and cooling which is due to reflection are equal. They aren’t. Once again, you assume way too much, and your assumptions are wrong.

      • David, is English not your first language?
        CO2isLife did not say what you claim he said. Read the two statements again, then apologize to the class for making a fool of yourself.

      • David, David, David.
        Can’t you just, for once, admit that you were wrong and be a big boy about it?
        This constant need to prove yourself right, even when you are clearly in the wrong just shows how much growing up you still need to do.

      • David, I’m afraid that you flunk basic reading.
        “What part of simple arithmetic do you not understand Mr. CO2isLife?”
        He responded: “None of it.”
        That is the logical equivalent of:
        “What part of simple arithmetic do you understand Mr. CO2isLife?”
        He responded: “All of it.”

      • You misrepresent his answer. Either again an issue with basic reading, or worse. In his answer “They aren’t” does not refer to your question, instead it refers to part of his answer to your question.

        This was the Q&A:
        “Is it net zero, net insulate or net cool?”

        You assume the warming which is due to trapping heat and cooling which is due to reflection are equal. They aren’t. Once again, you assume way too much, and your assumptions are wrong.

        And by the way, he is right. Clouds are not net neutral, they have a clear cooling effect. And are either ignored or otherwise misrepresented by the climate models (as a whole ensemble and by each individual one that I’m aware of; although some may make an attempt).

      • My math is fine. You may want to study your grammar… of which I am no expert, but I understand double negatives.

        When one finds themselves in a hole do you?
        1) Ask for a Ladder?
        2) Ask for a bigger Shovel?

        David, learn to ask for a ladder and stop digging. You’ve lost.

    • Don’t expect David to actually attempt to defend his statements. His forte is just repeating the same claims over and over again. Then when he gets far enough behind, come up with a new unrelated claim and pretend that was his position all along.

    • Are you implying that daytime cloud cooling balances with nighttime cloud insulation on a global scale? Would the heat transfer mechanism consider radiation, convection, conduction, changes in dew point, wind, altitude, changes in cloud cover as a function of time-of-day etc. etc.? I’ve taken Mathematical Physics in grad school and couldn’t solve this problem. Some experts could probably educate us but I doubt if simple arithmetic will give us an answer.

        • I can show Mr Dirkse the proof that clouds affect W/m2 where it matters, but he would ignore it.
          I can point out to Mr Dirkse that there is much more power in solar energy than in surface-radiated LWIR, so clouds are therefore a coolant, but he would ignore that as well.
          I could point out that there are lots of things that affect cloud formation, and cosmic rays appear to be one of them, but he would ignore that as well.

          So at the end of the day it is best to just ignore him as the bad-mannered troll that he is.

      • ‘…I’m not “implying” anything…’

        Thus far, you’ve implied a high-level of stupidity.

      • “I’m not “implying” anything, I’m merely asking what is the net. We all know daytime clouds cool, and nighttime clouds insulate. What is the NET effect?????”

        David, I’m just curious and trying to figure out how your brain works. Have you ever seen a cloud stationary and unchanging for a 24 hour period? The hardest part of responding to your questions is to try to figure out what you are even asking and what context.

        Clouds can reflect incoming radiation, largely of the high energy visible spectrum, reflect outgoing low energy LWIR, and contain latent heat from condensation and thermalization of outgoing LWIR or incoming visible radiation. How could you possibly assume that those three highly different processes would NET out to anything other than a random number? You seem to imply that one is +2 and the other must be -2. There is no reason they should be equal, they are different mechanisms and involve different wavelengths.

    • “They averaged all the runs and models, regridded the data into 10 degree x 10 degree grid cells (576 altogether, with polar regions omitted) and extracted 40 annual temperature anomalies for each gridcell over the 1979 to 2018 interval. From these they extracted a spatial “fingerprint” of the model-generated climate pattern using principal component analysis, aka empirical orthogonal functions.”

      That’s a lot of arbitrary manipulations. With a shining goal you want to prove, you try hundreds of promising approaches, and, finally, Eureka! We found a five sigma signal!
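The fingerprint step described in the quoted passage (extracting a leading spatial pattern from gridded model output via EOF/principal-component analysis) can be sketched schematically. The data here are synthetic stand-ins, not the regridded model fields the paper used:

```python
import numpy as np

rng = np.random.default_rng(42)

# Schematic stand-in for the ensemble-mean field: 40 years x 576 gridcells
# (10x10-degree cells, as described in the quoted passage). Real input
# would be regridded model output, not random numbers.
years, cells = 40, 576
pattern = rng.standard_normal(cells)        # a fixed spatial pattern
amplitude = 0.02 * np.arange(years)         # growing with time
data = np.outer(amplitude, pattern) + 0.1 * rng.standard_normal((years, cells))

# EOF/PCA: centre in time, then SVD; the leading right singular vector
# is the dominant spatial "fingerprint".
anomalies = data - data.mean(axis=0)
_, s, vt = np.linalg.svd(anomalies, full_matrices=False)
fingerprint = vt[0]

# The leading EOF recovers the planted pattern (up to sign).
corr = np.corrcoef(fingerprint, pattern)[0, 1]
print(abs(corr) > 0.9)
```

Note the point this illustrates: the "fingerprint" is whatever dominant pattern the models themselves generate, so the many processing choices upstream (averaging runs, regridding, detrending) directly shape what gets detected.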

    • They can do the same by using observational (O) data from any unrelated process.

      They are essentially
      comparing O to N and O to N+H, and then they compare both comparisons: (O to N) versus (O to N+H),
      where N and H are modeled with H (assumed to be, hence modeled to be) significant compared to N.

      This is the same as:
      comparing N to N+H,
      where N and H are modeled with H (assumed to be, hence modeled to be) significant compared to N.

      Their conclusion essentially is that (N+H) differs significantly from N, under the assumption that H is significant in the first place…
      And that is pure circular reasoning, just in a very round-about way.

  26. We have one equation and two unknowns. It can never be solved. Never. Santer can write as many papers as he likes, but he can never solve an unsolvable equation. So it is mad to throw $billions at speculative solutions. The sane, logical thing to do is to adapt to whatever temperature we live in.

    • They have their protectors in Congress,
      like Schiff for Brains in the House,
      and Marxist Malarky in the Senate.

  27. The views, opinions and findings contained in this report are those of the authors and should not be construed as a position, policy, or decision of the US Government, the US Department of Energy, or the National Oceanic and Atmospheric Administration.


  28. “But by detrending every gridcell Santer et al. removed such patterns, artificially worsening the estimated post-1979 natural comparator.”

    One has to strongly suspect they did it both ways (I do, of course). First they did it without de-trending, and they got crap — nothing. Then they realized they had to de-trend the “nature-only” natural variability comparator to ID a bank-robber (CO2). This is the equivalent of the IPCC accepting by simple acclamation that anthro CO2 forcing was near-100% the cause of all the observed warming from 1950 to 2016, and that natural climate variability stopped happening after 1950.

    Natural variability (internal cycles that to them are noise) has always been and will always be the elephant in the room, eating the banquet salad bowl and crapping on the floor, that the modellers must ignore if they want an alarmist story on CO2. Basically Santer et al. have Photoshopped out the elephant (by de-trending) from the historical evidence, like any good Stalinist propaganda photo erasing things and people that don’t fit the message.

    Even Jim Hansen in 1988 knew they could get away with this contrived anthropogenic causality through about the mid-2010s. That’s because he knew that from 1979-1980 onward there would be roughly 35 years of a natural warming phase of climate, based on the historical 1910-1945 warm “blip” followed by the 1950-1975 cooling phase. Knowing this, they could capitalize on the timing and pin the blame on CO2 forcing. So here Santer takes that deception and runs with it, to try to score a knock-out on policy before the natural-variability elephant rears up and destroys the modellers’ party in the next 25 years.
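The de-trending complaint in this comment can be illustrated with a toy series: removing a least-squares linear fit from a "natural-only" record strips exactly the slow, trend-like natural excursions. This is a sketch under the critique's assumptions, not the paper's actual processing:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(40, dtype=float)

# Toy "natural variability": a slow multidecadal swing (period longer
# than the 40-year record, so it looks like a trend) plus noise.
natural = 0.3 * np.sin(2 * np.pi * t / 140.0) + 0.1 * rng.standard_normal(40)

# Detrend as described: subtract the least-squares linear fit per series.
slope, intercept = np.polyfit(t, natural, 1)
detrended = natural - (slope * t + intercept)

# The detrended "natural" comparator has lost its low-frequency variance,
# so any trending signal looks anomalous measured against it.
print(natural.std() > detrended.std())
```

By construction least-squares detrending can only reduce the residual variance, and here it removes most of the slow swing, which is the sense in which the comment says the natural comparator is artificially worsened.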

  29. It is useless to analyse the detail in this paper that Judy unearthed. It somehow got published despite its faults. It should have been sent back to the authors to bring the scales and definitions in figures 1 and 4 to a common code. We don’t need a proliferation of codes like Microsoft dumped on its new Word etc. editions, where 95 percent of the code words will never be found. I found Judy’s unmasking of that “root” from statistics a revelation of what not to do with spare coding space. Too bad this paper got published at all.

  30. If correct, this is the takeaway for me.
    “According to the Supplementary information, Santer et al. took annually-averaged climate model data based on historical and (RCP8.5) scenario-based natural and anthropogenic forcings and constructed mid-troposphere (MT) temperature time series that include an adjustment for stratospheric cooling (i.e. “corrected”).”

    So they take model outputs, “correct” them, and then use that as “proof”?

    They call this science? The statistics are totally irrelevant if that is true.

  31. A lot of the statistics is well above my pay grade – for me it might just as well have been written in Mandarin.
    However, one sentence possibly gives a concise definition of what Santer et al actually did:
    Referring to the diagram:
    “It only shows the ratio of two things: a measure of the rate of improvement over time of the correlation between observations and models forced with natural and anthropogenic forcings, divided by a measure of the standard deviation of the same measure under a “null hypothesis” of (allegedly) pure natural variability. ”

    So, the diagram implies that the correlation between observations and the models was improving over time. First, it’s important to remember that these must refer only to model forecasts of the past i.e. when the answer was already known. Despite this, the improving correlation would still be significant – apart from one important aspect: tuning.

    Willis once observed that model tuning might be similar to evolution: tunings that improved the fit with historical data would be kept, while tunings that made the fit worse would be removed. In this way the correlation between the models and historical data would get better and better with time. Also, the author of a recent paper on model tuning specifically stated that a major purpose of tuning was to improve the match with aspects of the historical data.

    The tunings would benefit the full models (natural and CO2 forcing) but not the versions with no CO2 forcing. So, due to tuning, the ratio of correlation between full models and no CO2 models would steadily get better with time. Quite possibly, after many years and the squandering of billions of dollars on these models, their match with what had already happened could become virtually perfect. Result: the graph shown above and a virtually perfect proof that the warming is all our fault!

    Of course, it does nothing of the sort. The graph merely shows that the curve fitting with historical data has been getting better over time. But that’s all it is: a sophisticated and eye-wateringly expensive form of curve fitting. It says absolutely nothing about our understanding of the climate system.

    This new paper appears to be just a more sophisticated version of the tired old IPCC proof: that models with CO2 forcing match the historical data while those with no CO2 forcing don’t. But this “proof” is completely invalidated by the fact that the models have enormous amounts of arbitrary tuning. It’s not just wrong. It’s close to fraudulent.
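Willis's evolution analogy can be sketched as code. This is a caricature of tuning under the comment's assumptions, not any actual GCM workflow: random parameter tweaks are kept only when they improve the hindcast fit, so the match to the already-known past improves without any added understanding.

```python
import math
import random

random.seed(3)

# Pretend "historical record": 40 annual values from an unknown process.
history = [0.02 * t + 0.2 * math.sin(t / 5.0) for t in range(40)]

def model(params):
    a, b, c = params
    return [a * t + b * math.sin(t / c) for t in range(40)]

def misfit(params):
    # Sum of squared errors against the known past (the hindcast fit).
    return sum((m - h) ** 2 for m, h in zip(model(params), history))

# Evolutionary "tuning": random tweaks, keep only the improvements.
params = [0.0, 0.0, 3.0]
errors = [misfit(params)]
for _ in range(2000):
    trial = [p + random.gauss(0.0, 0.01) for p in params]
    if misfit(trial) < misfit(params):
        params = trial
    errors.append(misfit(params))

# Fit to the already-known past improves monotonically by construction.
print(errors[-1] < errors[0])
```

The curve of `errors` over iterations only ever goes down, which is the shape of the comment's argument: an ever-better historical match is guaranteed by the selection rule, and says nothing about out-of-sample skill.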

  32. People are so accustomed to the Gaussian (Normal) critical values that they sometimes forget that they are only valid for t-type statistics under certain assumptions, that need to be tested. …

    This is a very pertinent issue throughout climate-change science. People do all sorts of things with statistics that ought to be scrutinized for compliance with assumptions, but are not. The late Petr Beckmann referred to this as the “Gaussian disease”.
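The "Gaussian disease" point is easy to demonstrate: applying white-noise critical values to autocorrelated data inflates apparent significance. A small Monte Carlo sketch, using AR(1) noise with no trend at all and a naive OLS trend test:

```python
import numpy as np

rng = np.random.default_rng(7)
n, trials, phi = 40, 2000, 0.8  # 40 "years", strong AR(1) persistence

t = np.arange(n)
t_centered = t - t.mean()
false_positives = 0
for _ in range(trials):
    # AR(1) noise with zero true trend.
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for i in range(1, n):
        x[i] = phi * x[i - 1] + e[i]
    # Naive OLS trend t-statistic, assuming white-noise errors.
    slope = (t_centered @ x) / (t_centered @ t_centered)
    resid = x - x.mean() - slope * t_centered
    se = np.sqrt((resid @ resid) / (n - 2) / (t_centered @ t_centered))
    if abs(slope / se) > 1.96:
        false_positives += 1

rate = false_positives / trials
print(rate > 0.05)  # far above the nominal 5% level
```

With persistence this strong, the naive test "detects" a trend in a large fraction of trend-free series, which is exactly why Gaussian critical values must not be used before the error-structure assumptions are tested.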

  33. I suspect I could use their technique to identify a correlation between butterfly wings and the surface temperature of Mars.

  34. Surely by now there’s enough evidence proving that CO2 is a minor player in climate (except for the greening). Why the continued discussion? Hasn’t Dr. Henrik Svensmark solved the climate-change mystery? If not, please, you experts, attempt to prove him wrong. CO2 is so last year… boring. The IPCC, the Skeptical Science blog, Al Gore and most likely the Democratic National Committee disagree with his hypothesis, but it sounds pretty good to this layman.
    • The coolings and warmings of around 2°C that have occurred repeatedly over the past 10,000 years, as the Sun’s activity and the cosmic ray influx have varied.
    • The much larger variations of up to 10°C occurring as the Sun and Earth travel through the Galaxy visiting regions with varying numbers of exploding stars.
    The author:
    • Dr. Henrik Svensmark, Danish National Space Institute, at the Technical University of Denmark (DTU).

  35. Reading the Santer paper is like watching a good magician: I am positive that I cannot tell you how he did it, but I feel pretty confident in telling you that it is make-believe magic. I’m five-sigma confident.

  36. Fingerprints of ‘natural’ climate variability.

    Rapid AMO warming from the mid-1990s is covariant with:

    1) A decline in low cloud cover globally, leading to surface warming, and increased upper ocean heat content.

    2) Changes in the vertical distribution of water vapour: declines in lower-stratosphere water vapour, leading to cooling; increases in low-to-mid troposphere water vapour, at least partly due to higher SSTs coupled with an increase in surface wind speeds over the oceans, leading to low-to-mid troposphere warming.

    3) Reduced CO2 uptake in the warmer North Atlantic and in land regions made drier by the warm AMO phase (and increased El Nino).

    All because ocean phases vary inversely to changes in climate forcing.

    Correlations of global sea surface temperatures with the solar wind speed:
