Sunspots: Labitzke Meets Bonferroni

Guest Post by Willis Eschenbach

In a previous thread here on WUWT, a commenter said that the sunspot-related variations in solar output were shown by Labitzke et al. to affect the stratospheric temperature over the North Pole, viz:

Karin Labitzke cracked that nut. She was the first one to find a correlation between not one but two atmospheric parameters and solar activity. After almost 40 years her findings are still solid, and thanks to her we know that the strength of the polar vortex depends on solar activity modulated by the quasi-biennial oscillation.

And when I went and got the data from the Freie Universität Berlin, I was able to replicate their result. Here’s the relationship Dr. Labitzke et al. found between sunspots and polar stratospheric temperatures:

Figure 1. Sunspots versus north pole stratospheric temperatures. Red line shows the trend.

So … what’s not to like? To lay the groundwork for the answer to that question, let me refer folks to my previous post, Sea Level and Effective N, which discusses the Bonferroni Correction and long-term persistence (LTP).

The Bonferroni Correction is needed when you’ve looked in more than one place or looked more than one time for something unusual. 

For example, suppose we throw three dice at once and all three of them come up showing fours … that’s a bit suspicious, right? It might even be enough to make you say the dice were loaded. The chance of three 4’s in a single throw of three dice is only about five in a thousand.

But suppose you throw your three dice, say, a hundred times. Would it be strange or unusual to find three 4’s in one of the throws among them? Well … no. Actually, with that many tries, you have about a 40% chance of getting three 4’s in there somewhere.
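The dice arithmetic above is easy to verify. Here is a minimal sketch in Python (my choice of language; the post itself contains no code):

```python
# Probability of three 4s in one throw of three dice, and of seeing
# at least one such triple somewhere in 100 throws.
p_single = (1 / 6) ** 3               # one throw of three dice
p_in_100 = 1 - (1 - p_single) ** 100  # 1 minus "never happens in 100 tries"

print(round(p_single, 4))   # about 0.0046, i.e. five in a thousand
print(round(p_in_100, 2))   # about 0.37, i.e. roughly a 40% chance
```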

In other words, if you look in enough places or you look enough times, you’ll find all kinds of unusual things happening purely by random chance.

Now in climate science, for something to be considered statistically significant, the odds of it happening by random chance alone have to be less than five in a hundred. Or to put it in the terms commonly used, what is called the “p-value” needs to be less than five hundredths, which is usually written as “p-value < 0.05”.

HOWEVER, and it’s a big however, when you look in more than one place, for something to be significant it needs to have a lower p-value. The Bonferroni Correction says you need to divide the desired p-value (0.05) by the number of places that you’ve looked. So, for example, if you look in ten places for some given effect, for the effect to be significant it would have to have a p-value less than 0.05 divided by ten, the number of places you’ve looked. This means it would have to have a p-value of 0.005 or less to be statistically significant.
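The rule is simple enough to write in one line of Python (the function name here is mine, purely for illustration):

```python
# The Bonferroni Correction described above: divide the target
# significance level by the number of places you have looked.
def bonferroni_threshold(alpha, n_tests):
    return alpha / n_tests

print(bonferroni_threshold(0.05, 10))   # 0.005, as in the example above
```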

So … how many places were examined? To answer that, let me be more specific about what was actually found.

The chart above shows their finding … which is that if you look at the temperature in February, at one of seven different possible sampled levels of the stratosphere, over the North Pole, compared to the January sunspots lagged by one month, during the approximately half of the time when the equatorial stratospheric winds are going west rather than east, the p-value is 0.002.

How many different places have they looked for a relationship? Well, they’ve chosen the temperature of one of twelve months, in one of seven atmospheric levels, with one of three sunspot lag possibilities (0, 1, or 2 months lag), and one of two equatorial stratospheric wind conditions.

That gives 504 different combinations. Heck, even if we leave out the seven levels, that’s still 72 different combinations. So at a very conservative estimate, we’d need to find something with a p-value of 0.05 divided by 72, which is 0.0007 … and the p-value of her finding is about three times that. Not significant.
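The counting above can be checked directly; a short Python sketch of the same arithmetic:

```python
# Combination count from the post: 12 months x 7 stratospheric levels
# x 3 sunspot lags x 2 QBO wind phases.
combos = 12 * 7 * 3 * 2
print(combos)                        # 504

# Conservative count, leaving out the seven levels:
combos_conservative = 12 * 3 * 2     # 72
threshold = 0.05 / combos_conservative
print(round(threshold, 4))           # about 0.0007
print(round(0.002 / threshold, 1))   # reported p-value is roughly 3x the threshold
```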

And this doesn’t even account for the spatial sub-selection. They’re looking just at temperatures over the North Pole, and the area north of the Arctic Circle is only 4% of the planet … which would make the Bonferroni Correction even larger.

That’s the first problem, a very large Bonferroni Correction. The second problem, as I discussed in my post linked to above, is that we have to account for long-term persistence (LTP). After accounting for LTP, the p-value of what is shown in Figure 1 above rises to 0.09 … which is not statistically significant, even without considering the Bonferroni Correction.
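Willis’s exact LTP calculation is in his linked post, not reproduced here. As a rough illustration of why persistence raises p-values, here is one common textbook adjustment (the lag-1 autocorrelation “effective n”; this is my assumption of a representative method, not necessarily his), sketched in Python:

```python
import numpy as np

# One common adjustment for serial correlation: replace the sample
# size n with an "effective n" shrunk by the lag-1 autocorrelation r1:
#     n_eff = n * (1 - r1) / (1 + r1)
def effective_n(x):
    x = np.asarray(x, dtype=float) - np.mean(x)
    r1 = np.dot(x[:-1], x[1:]) / np.dot(x, x)   # lag-1 autocorrelation
    return len(x) * (1 - r1) / (1 + r1)

# A persistent (AR(1)-like) series has far fewer independent values
# than raw observations, which is why its p-values must rise.
rng = np.random.default_rng(42)
x = np.zeros(500)
for i in range(1, len(x)):
    x[i] = 0.8 * x[i - 1] + rng.normal()

print(len(x), round(effective_n(x)))   # effective n is far below 500
```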

To summarize:

  • As Labitzke et al. found, February temperatures at 22 kilometres altitude over the North Pole during the time when the equatorial stratospheric winds are blowing to the west are indeed correlated with January sunspots lagged one month.
  • The nominal p-value without accounting for LTP or Bonferroni is 0.002, which appears significant.
  • However, when you account just for LTP, the p-value rises to 0.09, which is not significant.
  • And when you use the Bonferroni Correction to account just for looking in a host of locations and conditions, you’d need a p-value less than about 0.0007 to be statistically significant.
  • So accounting for either the LTP or the Bonferroni Correction is enough, all by itself, to establish that the claimed correlation is not statistically significant … and when we account for both LTP and Bonferroni, we see that the results are far, far from being statistically significant.

Unfortunately, the kind of slipshod statistical calculation reflected in the study is far too common in the climate debate, on both sides of the aisle …

ADDENDUM: I was lying in bed last night after writing this and I thought “Wait … what??” Here’s the idea that made me wonder—if you were going to look for some solar-related effect in February, where is the last place on Earth you’d expect to find it?

Yep, you’re right … the last place you’d expect to find a solar effect in February would be the North Polar region, where in February there is absolutely no sun at all … doesn’t make it impossible. Just less probable.

Finally, does this mean that the small sunspot-related solar variations have no effect on the earth? Not at all. As a ham radio operator myself (H44WE), I know for example that sunspots affect the electrical qualities of the ionosphere.

What I have NOT found is any evidence that the small sunspot-related solar variations have any effect down here at the surface. Doesn’t mean it doesn’t exist … just that despite extensive searching I have not found any such evidence.

My best regards to all,

w.

PS—As usual, I request that when you comment, you quote the exact words you are referring to, so that we can all be clear about just who and what you are discussing.


163 thoughts on “Sunspots: Labitzke Meets Bonferroni”

  1. “Suppose we throw three dice at once and all three of them come up showing fours … that’s a bit suspicious, right?”…

    no not at all…..Hillary won every coin toss………../snark

    • Kerry, I believe the current term is “p-hacking”, looking everywhere for a low p-value until you find one.

      w.

      • Nice brief and well argued article.

        The common term is cherry picking. If this effect is not found in January or March, or in any other month of the year, it should be clear to anyone with a modicum of reflection that it is an accidental, not a meaningful, result. I’m always very skeptical of studies which pull out winter months or some other subset to produce a “finding”.

        Here’s the idea that made me wonder—if you were going to look for some solar-related effect in February, where is the last place on Earth you’d expect to find it?

        The equinox is in March, when the Earth is side-on to the Sun, and the Arctic does receive sun. February may be cold, but it’s not the middle of winter in celestial terms. The stratosphere will get direct insolation anyway; it is the surface which is in shadow.

        Also, there is the funneling effect of the magnetic field and a cascade of high-energy particles entering the Arctic atmosphere. This will be affected by the solar wind, etc. It is not unreasonable to propose an atmospheric effect in February. The problem comes when it is ONLY seen in that one month.

  2. A well-known example of similar reasoning is
    “If you have 23 people in a room, what is the chance that two will have the same birthday?”
    The answer is just over 0.5. If you have 40 people the answer is about 90%.
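The birthday figures in the comment above can be computed exactly; a short Python sketch (editor’s illustration, assuming 365 equally likely birthdays):

```python
# Exact probability that at least two of n people share a birthday.
def p_shared_birthday(n):
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (365 - k) / 365   # k-th person avoids all earlier birthdays
    return 1 - p_distinct

print(round(p_shared_birthday(23), 3))   # 0.507 -- just past even odds
print(round(p_shared_birthday(40), 3))   # 0.891
```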

    However, the Bonferroni correction assumes that all the possible tests are uncorrelated. If they are not, the correction is less straightforward, and plain Bonferroni over-corrects, simply because one is re-sampling the same effect.

    In this particular case, I’ve no idea of the correlation between the different possibilities but, since they are connected, correlations may well exist and the plain Bonferroni correction may be too severe.

  3. Not significant is not insignificant. That’s the p-value fallacy, so common in science today, that stifles scientific advance.

    • Statistical significance is often incorrectly taken to mean significant. A more useful term is clinically significant. E.g., drug trials of blood-pressure-reducing medicines can show that they reduce patient blood pressure by a statistically significant amount, but in medical terms the reduction can be too small to be meaningful.

  4. My biggest smile:

    ADDENDUM: I was lying in bed last night after writing this and I thought “Wait … what??” Here’s the idea that made me wonder—if you were going to look for some solar-related effect in February, where is the last place on Earth you’d expect to find it?

    One of those great observations so obvious that I’m annoyed that I didn’t think of it.

  5. Lack of such correction is a widespread deficiency in climate research. Why, the very last WUWT article before this allegedly attributes wildfire frequency in California to global warming, against the odds.
    We need a quiet citizen scientist rebellion, with a memorable flag noting “Fails Bonferroni Test.”

  6. Is there a correlation between a west QBO, a quiescent sun, and SSW events? I have seen graphs that seem to indicate a correlation.

  7. Willis, I really do appreciate all the unpaid work you do to further true climate science. Skewering skeptic claims is a huge service, as is skewering alarmist claims.

  8. “ADDENDUM: I was lying in bed last night after writing this and I thought “Wait … what??” Here’s the idea that made me wonder—if you were going to look for some solar-related effect in February, where is the last place on Earth you’d expect to find it?”

    I agreed at first but…

    1) There is no doubt that the sun directly influences the earth’s atmosphere via its magnetic field regardless of where the sun shines. The fact that the auroral oval shifts to the “dark side” is just one proof of this.

    2) That the total solar irradiance (measured by satellites perpendicular to the sun) varies by only about 0.1% regardless of the number of sunspots implies that, should sunspots be an indicator of an unknown influence upon the global climate of the earth, the north pole during winter is an excellent place to search for evidence.

    • Thanks, Mike. We’re not talking about auroras. We’re talking about temperature and for that we need some physical connection that can put a rather significant amount of energy into dark polar nights.

      Note that I didn’t say “not possible”. I said it wouldn’t be where I’d look by choice.

      w.

    • If Aurora can account for enough energy transfer, they could account for stratosphere temperature at the north pole in the dead of winter. This being the case, temperature distribution there during periods of high solar activity should reflect the ease with which electrons can reach the atmosphere at different locations.

    • “Aurorae = electrons from the sun = energy.”

      Please note – temperature is not a measure of energy (or heat).

        • Explain then how 75 °F air at near 100% RH in Louisiana has over twice the heat content per unit volume of 100 °F air at near 0% RH in Phoenix. (This was blogged on WUWT a few days ago.)

          • Hint to Steve Reddish – when you spew energy into water at 100C and 1-atm (like in a pan on your stove), you get steam at 100C with no temperature increase.

  9. Excellent example of an extreme unbeliever fighting the result he does not like. P-values and Bonferroni mumbo jumbo is his last refuge.

    • @unka
      Well, that was certainly a cogent reply…NOT. Did you intend to offer something constructive, Snowflake?

      • Willis does not know what he is talking about. You have N data points (X, Y) and you can always calculate a correlation. You can check some outliers or check the hypothesis of whether the relationship is linear or not, but beyond that there is no point in talking about p-values, and on top of it bringing up Bonferroni is a manipulation by a dishonest and/or ignorant person.

        • You spend all your effort attacking what you perceive to be Willis’ motivations without making any scientific point.

          If there is “no point in talking about p-values” maybe you should be criticising the authors, not Willis.

  10. @Willis

    The Bonferroni Correction is needed when you’ve looked in more than one place or looked more than one time for something unusual.

    I’m no expert on the Bonferroni Correction, having just read about it here, but the Wikipedia description of it is somewhat different from yours:

    The Bonferroni correction is named after Italian mathematician Carlo Emilio Bonferroni for its use of Bonferroni inequalities.[1] Its development is often credited to Olive Jean Dunn, who described the procedure’s application to confidence intervals.[2][3]

    Statistical hypothesis testing is based on rejecting the null hypothesis if the likelihood of the observed data under the null hypotheses is low. If multiple hypotheses are tested, the chance of a rare event increases, and therefore, the likelihood of incorrectly rejecting a null hypothesis (i.e., making a Type I error) increases.[4]

    So it’s used for statistical testing, under _multiple hypotheses_, to prevent a false rejection of the null hypothesis.

    What are the multiple hypotheses that you are testing? And what is the null hypothesis that you are trying to reject?

    BTW, I tend to be skeptical of claims that solar activity has an effect on surface weather and climate, because there is simply no compelling evidence of it. Yet. I’m not saying the effects are zero, but they are probably very small, commensurate with the tiny 0.1% variation in TSI.

    But I think in this case it may not be surprising to find a connection between sunspot counts and temperature, because this event is taking place in the stratosphere.

    At the poles, the tropopause is very low (about 10 km), so this relationship is observed at 22 km, well into the stratosphere. Ultraviolet heating (ozone etc.) is a very well-known phenomenon there and may have some variability dependent on solar activity.

    And going higher, in the thermosphere (above 80 km), solar magnetic activity definitely affects temperature; it is a virtual proxy for sunspot activity.

    • Each instance of the investigation constitutes a new hypothesis. Hence, Willis’ observation that there are 504 different combinations. Or 72, to be generous. There is an xkcd cartoon that illustrates it nicely:

      https://xkcd.com/882/

      • “each instance of the investigation constitutes a new hypothesis”

        I see it more as a “multivariate” regression rather than “multihypothesis” test. How would you formulate this as a statistical test? What is the null hypothesis? Exactly what are the hypotheses being tested?

        As a regression it is simple. The dependent variable is temperature and the independent variables are QBO_polarity, sunspot_count, geopotential heights of the effective layers, etc.

        How does the existence of multiple variables destroy the temperature relationship being investigated? If a regression turns out to have predictive power/skill, who cares about its lack of Bonferronis?

    • Johanus,

      What are the multiple hypotheses that you are testing? And what is the null hypothesis that you are trying to reject?

      I will try to anticipate the likely responses by Willis. The multiple hypotheses would be that the sunspots affect layer 1, or layer 2, or layer 3 … or layer 7, in January, or in February, or in March … with this wind, or with this other wind, as he has already explained. Over 500 different hypotheses.
      The null hypothesis would be that sunspots do NOT significantly affect any of the layers in any month with any wind.
      In order to reasonably infer causation from what they have (a mere correlation), you need a strong correlation, which they have. But that would only suffice if they had looked for it solely in February, in that layer and with that wind. That’s not what they did. They looked at more than 500 combinations, and if you look in many places, you need a much stronger correlation, because a merely strong correlation of p<0.05 actually has a good chance of occurring by pure luck in some of the places where you looked (more than 500 in this case). Rejecting the null hypothesis means rejecting that the correlation could have happened by pure chance. Therefore they have NOT rejected the null hypothesis.
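Nylo’s “look in more than 500 places” argument is easy to put in numbers; a small Python sketch (editor’s illustration, assuming for simplicity that the tests are independent):

```python
# Chance that at least one of n independent tests comes up
# "significant" at p < 0.05 purely by luck.
def p_false_alarm(n_tests, alpha=0.05):
    return 1 - (1 - alpha) ** n_tests

for n in (1, 72, 504):
    print(n, round(p_false_alarm(n), 4))
# With 72 tests a spurious "hit" is near-certain (~97.5%);
# with 504 tests it is essentially guaranteed.
```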

      • @Nylo
        “The Null hypothesis would be that Sunspots do NOT affect significantly any ot the layers in any month with any wind.”

        Thanks. This is indeed a complicated model, with many layers and several additional “qualifications” of features. It even has a bit of “magic”: an equatorial-polar teleconnection (‘action at a distance’), where events at the equator affect events at the poles.

        So I think all of these dataset features just support a somewhat simpler hypothesis: solar activity affects polar stratospheric temperatures.

        That is, we are trying to predict a temperature, given a set of qualified features, i.e. multivariate regression. It is also a search problem. Willis refers to it pejoratively as “data dredging”; I prefer to call it “data mining”. We are trying to optimize the qualified feature set to minimize the regression error.

        Simply rejecting the hypothesis, because it doesn’t have enough “Bonferronies”, is throwing the baby out with the bath water.

        There seems to be a very interesting correlation of solar activity to temperature, which I can see easily, but roughly, in Willis’ plot. The feature set needs to be refined to strengthen that correlation and to learn more about the underlying physics, which I suspect might be heating caused by scattering of far ultraviolet (i.e. the UV normally absorbed before it reaches the troposphere).

        Yes, I’m waving my hands a bit here. But that’s how science often happens.

    • Well, the TCI seems to correlate well with SC24. And the physics makes sense too, because high sunspot activity is correlated with enhanced EUV irradiance, which has no effect in the troposphere because it is all absorbed in the thermosphere.

      So, in the thermosphere, the enhanced EUV does indeed create hotter “weather”. But this heating has zero effect on the troposphere below.

      • Low solar activity and a reduced height of the thermosphere have an impact on the polar circulation, with all the consequences we see and feel, e.g. this winter in the NH.

    • If you remove the outliers at -50C, the remaining points seem to correlate much better to the trend line, such that we can infer lower temperatures at lower sunspot counts, higher temp at higher activity, etc.

        • There simply is no sunspot-to-temperature correlation. Use the Central England temperature record back to 1850: plot monthly mean temperatures against monthly mean sunspot counts. A total blob. With fewer data points, correlations can look better, but several hundred Central England points show the reality. It is long past due to debunk the myth of a correlation between global or local temperature records and sunspot counts.

        • Donald, I totally agree, at the surface of the Earth.

          But in the thermosphere the correlation is remarkably high, because the air heats up as a result of EUV scatter.

          Does this kind of heating occur also in the upper stratosphere, maybe due to UV or UV-C? Maybe. Worth investigating.

          Does it affect tropospheric weather? Probably not, but I wouldn’t shut the book on it.

        • Are not the “optics” in use nowadays (2019) for viewing/counting “sunspot” numbers far, far better than they were during the 100 years post-1850?

          And how much better are today’s (2019) “optics” than in the yesteryears when sunspots were first observed and recorded in China some 3,700 years ago, during the Shang Dynasty (~1700 BC to ~1027 BC)?

          Does it make “logical” science sense to append the historical sunspot number records to the modern, late-20th-century / early-21st-century sunspot number record?

          Sure it does, ….. as long as one doesn’t try to “squeeze” a few science-fiction “truths” out of the composite record.

          • It’s my understanding that the same telescopes that Wolf used are still in use today, in order not to bias the record towards the modern era. Leif has posted here about that several times.

  11. Heck, last night the “Science Channel” was saying we’ve got to bring woolly mammoths back so they can knock down all the trees in the Arctic to save the permafrost and, of course, the planet……..proof positive it’s getting too cool too fast……….

    https://corporate.discovery.com/discovery-newsroom/science-channel-goes-in-search-of-woolly-mammoths-and-other-creatures-frozen-in-time-in-lost-beasts-of-the-ice-age-premiering-sunday-february-24-at-8pm/

    Dr. Tori Herridge, Palaeontologist of the Natural History Museum, continued: “The quest to understand the extinction of so many large animals at the end of the last ice age – and whether humans, or climate change, or both, were responsible – has never felt so important in a world where wildlife is under increasing threat.”

    Professor George Church, Geneticist, Harvard Medical School, talking about his plan to use genetic engineering to bring mammoths back to Siberia, said: “The project really feels like it’s leaping forward. We didn’t expect so many high-quality specimens. It’s just very exciting.”

  12. p-value of 0.05
    =========
    No science or business should ever accept such a weak result.

    How much would you trust an airplane where 1 in 20 parts may be faulty?

    So why trust a scientific paper with such a high probability of error? Peer review doesn’t improve the p-value.

    This may well explain the lack of progress in climate science. The foundation is riddled with false facts let in by a weak p-value criterion.

    • “No science or business should ever accept such a weak result.”

      That’s funny.

      Car radiators are known to fail in extremely hot weather. Suppose you run an auto store and you want to know if you should increase your stock of Corvette radiators for the next two weeks.

      I tell you:

      1. Most years you sell 5 Corvette radiators in these two weeks of the summer.
      2. Looking at a 14-day forecast, I’m 70% confident that number will double.

      Do you require 95% confidence to increase your inventory?

      Nope.

      You should go out and try to make a living.

      • “1. Most years you sell 5 Corvette radiators in these two weeks of the summer.
        2. Looking at a 14-day forecast, I’m 70% confident that number will double.

        Do you require 95% confidence to increase your inventory?”

        And then none blow for the rest of the year, and you carry more than one in inventory for 52 weeks, soaking up capital you need to run your business for the rest of the year. Sorry, Steven, that’s not how inventory management works. The magic eight ball says try again later.

      • The number of unsupported assumptions in your missive is impressive, Mr. Mosher! You qualify for a career in CliSci, but not a real business.

        BTW, whatever happened to your foray into hustling Chinese bitcoin mining hardware?

      • Gotta go with Mr. Mosher on this. ANY individual time or place is one of an infinite number of times or places you could have looked. Using this Bonferroni standard would invalidate ANY correlation, ever.

        Take the correlation: the warmest part of the day is 4 in the afternoon. Not a perfect correlation, but good enough to validate some truth. By this Bonferroni standard the statistical significance is vacated because you didn’t check the moon, Jupiter, etc.

  13. “p-value < 0.05”
    =========
    With thousands of climate researchers looking in hundreds of thousands of places for something unusual, and then only publishing where the results are positive, the Bonferroni Correction tells us that hundreds of published studies are likely false positives.

    Say, for instance, that only 1 in 20 studies on X gets published. That means as many as 19 other people were also looking for X, and the published p-value may not be significant at all. But we cannot tell, because we don’t know how many studies were not published.

    • Ferd

      Getting a p-value higher than some threshold X does not mean the result is a false positive. It means it is a less confident “yes”. There are no absolute yesses; every yes is accompanied by an uncertainty.

      Choosing a conventional <1/20 chance of a false positive is just that: a convention. Nothing more.

      One can say that a result failed to meet a 99.99% confidence test, but it might meet a 99.9% confidence test. It is not “wrong” because it doesn’t get a fourth nine. It is certain to one part in a thousand, but not to one part in ten thousand. So what?

      No one can demonstrate mathematically that they know the global average temperature to within one degree C with 99% confidence. The propagation of uncertainties doesn't permit that sort of claim. Yet people make claims to know it to 0.01 degrees with 95% confidence. Impossible. They know no such thing.

      It is over such issues that "climate experts" lose credibility. They literally don't know what they are talking about. They don't know enough to know that such a claim is impossible, based on the measurements available.

      • You have that right Crispin. Most disciplines have an established conventional significance level that is accepted by practitioners. This ranges from P<0.10 for some social sciences, to p<0.001 for some physical sciences. I worked in a small science faculty with 8 departments and +/- 50 faculty members. The chemists and physicists did most research under lab controlled conditions and demanded stringent p-values. Those of us who worked outside the lab (literally outside=the field) could only dream of collecting data/measurements with that kind of accuracy, and p<0.05 was standard. There was a period of time when convention called for reporting the actual probability rather than an arbitrary cutoff value. It is all about the nature of the beast. The "voted on" values in IPCC reports are a bad joke. "I think what we have is important, with very high confidence" has no place in science.

  14. A correlation coefficient of 0.28 is basically random noise. A plot of sunspots since 1850 against Central England monthly temperatures is a blob of total noise. There is no sunspot-to-temperature correlation. There is somewhat of a pyramid shape, as with lower sunspot counts there is more variability, but this may be an artifact of there being few high-sunspot months compared to low-sunspot months.

      • Don’t agree. When negative results are not published, this leaves a vacuum for false positives.

        Right now there is an epidemic of false positives in science because negative results aren’t published.

        • Ferd Berple: “Right now there is an epidemic of false positives in science because negative results aren’t published.”

          Not published even within the scientific community itself? That would be detrimental indeed. Or do you rather mean “false positives about science in qualified science journals and the popular media”?

        • Hmmm… still wondering… how can you be sure about something that isn’t “published”…

    • No, if r² = 0.28, the accepted interpretation would be that the sunspot number accounts for 28% of the variance in temperature. (If 0.28 is instead the correlation coefficient r, then r² ≈ 0.08, i.e. about 8%.)

  15. Correction says you need to divide the desired p-value (0.05) by the number of places that you’ve looked.
    =========
    With thousands of researchers looking in hundreds of thousands of places, it could well be that none of the published findings in climate science are significant.

    • ferd berple
      You’ve touched on something that I was going to challenge Willis with. Assume that you have done sampling with very fine temporal and spatial resolution in the collection of data, and used all of your data. At first blush, it would appear that one has done a good job. Yet, as I understand Willis’ claim, all that data means dividing the necessary p-value by a very large number, which, as it approaches infinity as a limit, means that the required p-value approaches zero. Detailed sampling seems to be counterproductive, in that the required p-value behaves like a mirage on the horizon, always receding as you approach it!

      There is another aspect to the conundrum. What if one’s budget was very tight, and the researcher could only afford to examine one altitude, and they got lucky and picked the altitude where a p-value was smaller than 5%. Should that be accepted as a correlation with statistical significance, even though the sampling was less than thorough? After all, the researcher only looked in one place so there is no requirement to reduce the p-value requirement.

      Something doesn’t seem right here.

      • You’re right, Clyde. Only looking at one or a few items leads to lack of understanding of the whole. One need look at the whole experiment/study to determine its validity.

      • Something doesn’t seem right here.
        =========
        You are confusing the number of data point with the number of times a study is performed.

        More data points in a study yield a smaller statistical error in that study. The more times a study is performed, the less significant each individual occurrence.

        The problem comes when a study is performed 100 times and you throw away the 99 that don’t fit the hypothesis and publish the one that does fit.

        The 99 you didn’t publish make the one you did publish roughly 99 times less likely to be correct.

        But since the 99 that were never published remain hidden no one can tell that the one you did publish is likely garbage.

        And with the publishing bias against negative results, it is possible and even likely that p-values across science are a load of rubbish, except the first time a study is performed.
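ferd’s scenario is easy to simulate; a toy Python sketch (editor’s illustration: each “study” tests a null effect, here two fair coins, and only the “significant” ones would get published):

```python
import random

# Every study below has NO real effect, yet about 5% of them come out
# "significant" at p < 0.05 -- the false positives that get published.
random.seed(1)

def null_study(n=100):
    """Compare two fair coins with a crude two-sided z-test at alpha = 0.05."""
    a = sum(random.random() < 0.5 for _ in range(n))
    b = sum(random.random() < 0.5 for _ in range(n))
    se = (2 * n * 0.25) ** 0.5        # std. dev. of a - b under the null
    return abs(a - b) / se > 1.96     # "significant" despite no real effect

runs = 2000
false_positives = sum(null_study() for _ in range(runs))
print(false_positives / runs)         # close to 0.05
```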

        • It would be interesting to apply this to all the versions of code that a climate modeler goes through before settling on the version that he deems best.

        • ferd berple
          You said, “You are confusing the number of data point with the number of times a study is performed.” However, what Willis said was, “How many different places have they looked for a relationship? Well, they’ve chosen the temperature of one of twelve months, in one of seven atmospheric levels, with one of three sunspot lag possibilities (0, 1, or 2 months lag), and one of two equatorial stratospheric wind conditions.”

          Let’s take a situation where we sample data from a phenomenon or location with no expectation of there being a correlation. Do we tighten the p-value requirement because we have looked at places or phenomena that do not plausibly have a relationship (spurious correlations)? This is beginning to sound a little like the Heisenberg Uncertainty Principle, where the act of observation impacts the result.

  16. Willis
    Thanks for your mathematical explorations and explanation.
    PS Caution: Statistician WM Briggs strongly argues against using p values.
    Briggs, William M., 2019. Everything Wrong with P-Values Under One Roof. In Beyond Traditional Probabilistic Methods in Economics, V Kreinovich, NN Thach, ND Trung, DV Thanh (eds.), pp 22–44. DOI 10.1007/978-3-030-04200-4_2
    Abstract

    P-values should not be used. They have no justification under frequentist theory; they are pure acts of will. Arguments justifying p-values are fallacious. P-values are not used to make all decisions about a model, where in some cases judgment overrules p-values. There is no justification for this in frequentist theory. Hypothesis testing cannot identify cause. Models based on p-values are almost never verified against reality. P-values are never unique. They cause models to appear more real than reality. They lead to magical or ritualized thinking. They do not allow the proper use of decision making. And when p-values seem to work, they do so because they serve as loose proxies for predictive probabilities, which are proposed as the replacement for p-values.

    Another Proof Against P-Value Reasoning

    Think of it this way: you begin by declaring “The null is true!”; therefore, it becomes almost impossible to move from that declaration to concluding it is false.

    • WM Briggs strongly argues against using p values
      =======
      I strongly agree because the number of times a study has been performed cannot be reliably known so you cannot apply the Bonferroni Correction and thus cannot trust the p value.

  17. Fewer sunspots would mean more cosmic rays, and with Svensmark’s theory, that means more cloud formation. I would expect clouds to have some effect on stratospheric temperature.

    So, though the statistics may not be good enough from Labitzke, I would not be surprised that with some more work, the theoretical relationship is borne out experimentally.

  18. “…the last place you’d expect to find a solar effect in February would be the North Polar region, where in February there is absolutely no sun at all … doesn’t make it impossible. Just less probable.”

    Willis I think this is a big mistake. You have assumed that the effect of sunspots on the atmosphere is by a mechanism that depends on light, in a region that is most strongly affected by the solar wind. No cookie.

    The activity of the sun, as reflected in the sunspot number, is magnetic and CR-based, not light-based. As you well know from HAM work, the E and F layers are affected by electromagnetic factors. That’s why when the sun was active I could hear someone in NYC working 20 meters with four Watts and an under-roof dipole from 3D6-land 5/3. And everyone else on the E coast was 20 over 9.

    To me, it is quite reasonable that the effect at the N Pole would happen one month after a light-free period of glorious Aurora Borealis. I would expect that more than in any other month.

    The best way to analyse this is to make a falsifiable prediction: look now for the correlation for Feb 2019 under the requisite conditions. If it is there, it is better than 99% of the GCM’s which accurately predict nothing.

    Just because the p value you have at the moment is low doesn’t mean the correlation is not valid. As you pointed out in the last piece, it just needs more data. Let’s be patient.

  19. Willis,

    I am not following you this time.

    …”For example. Suppose we throw three dice at once and all three of them come up showing fours … ”

    How many sides on each die? If they are 6-sided, then wouldn’t the odds of three “4”‘s coming up be 1/6 x 1/6 x 1/6 = 1 / 216? That is a bit far from 1/ 1,000; or am I not understanding something here? Are these 10-sided dice? If so then I get it.

    …”the last place you’d expect to find a solar effect in February would be the North Polar region”

    Unless the effect is from transport, i.e. wind. I think this was mentioned in one of the papers I read over the suggested mechanics behind solar-max warming by EUV. I am not agreeing, just relaying.

    Other than that, as usual, really enjoyed your posting. I always learn something from you! The plot had me laughing.

    • RofT,

      What you calculated was the probability. Odds would be 1:215. Not a lot of difference in this case, but sometimes a very significant difference.

    • Robert of Texas February 25, 2019 at 5:49 pm

      Willis,

      I am not following you this time.

      …”For example. Suppose we throw three dice at once and all three of them come up showing fours … ”

      How many sides on each die? If they are 6-sided, then wouldn’t the odds of three “4”‘s coming up be 1/6 x 1/6 x 1/6 = 1 / 216? That is a bit far from 1/ 1,000; or am I not understanding something here? Are these 10-sided dice? If so then I get it.

      Thanks, Robert. I said the odds were “about five in a thousand”. To three decimal places, (1/6)^3 = 1/216 = 0.005.

      …”the last place you’d expect to find a solar effect in February would be the North Polar region”

      Unless the effect is from transport, i.e. wind. I think this was mentioned in one of the papers I read over the suggested mechanics behind solar-max warming by EUV. I am not agreeing, just relaying.

      Possible. I’m just saying it is less probable.

      Other than that, as usual, really enjoyed your posting. I always learn something from you! The plot had me laughing.

      Thanks much, such reactions keep me going.

      w.

    • “Unless the effect is from transport, i.e. wind. I think this was mentioned in one of the papers I read over the suggested mechanics behind solar-max warming by EUV. I am not agreeing, just relaying.”

      Robert:
      The effect is that reduced solar UV reduces stratospheric warming over the tropics/subtropics and hence the deltaT between there and the Arctic (in darkness). This then (necessarily) reduces the strength of the stratospheric polar vortex. From there, there can be tropospheric effects, as a planetary wave can disrupt the (weaker) vortex and cause a down-welling effect that allows high pressure to dominate for a time over the Arctic, which will push cold air south and warm air north. Just a mixing of absorbed solar energy and no addition or subtraction. There was a disruption this year, but they do not occur every year, as many other factors have to be in place as well to favour their occurrence. (FI an E’ly QBO – it was a westerly this year, and that is thought to be why the Arctic plunge never made it to westernmost Europe despite being predicted to.)

    • No, the way to look at the probability is: if the single-trial probability of being true is p, the untrue probability is (1 - p) = q.

      If you do two trials, the probability of not having a positive result is q^2, which is less than q.

      Thus the probability of not having a result after n trials is q^n, and the probability of a true result is 1 - q^n.

      With dice, the probability of having three 4s in one throw is (1/6)^3 = 4.63E-3 = p

      q = 1 - p = 0.9954
      After 50 trials, q^50 = 0.79, so the probability of three 4s is about 0.21,

      and after 100 trials the probability is 37%.
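    The arithmetic above is easy to verify; a short Python check of the at-least-once probability 1 - q^n:

```python
# Chance of at least one "three fours" in n throws of three six-sided dice.
p = (1 / 6) ** 3        # single-throw probability, about 0.00463
q = 1 - p               # probability a single throw is NOT three fours

for n in (1, 50, 100):
    print(f"{n:3d} throws: P(at least one triple four) = {1 - q**n:.3f}")
```

    This reproduces the figures quoted in the comment: roughly 0.21 after 50 throws and 37% after 100 throws.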

  20. Willis,
    Your data link says the data is here, but that data doesn’t seem to fit the problem space:
    https://www.geo.fu-berlin.de/met/ag/strat/produkte/qbo/qbo.dat

    The link shown in your plot seems to be the correct one:
    https://www.geo.fu-berlin.de/en/met/ag/strat/produkte/northpole

    See my 25Feb03:32 comment above. I think you are misapplying the Bonferroni correction. What is the null hypothesis that you are trying to reject (by obtaining a lower p-value)?

  21. Really appreciate willis’s statistics knowledge (psych majors tend to be that way) and also his rational approach to the solar cycle. Thank you willis.

  22. Willis:

    Off topic, but could you enlighten us on the Russian Climate Model, which supposedly is the best of the lot. I have not seen any discussion on it, tho I might have missed it.

    Thanks.

    • … the Russian Climate Model, which supposedly is the best of the lot …

      Climate models are innately untestable for their relative accuracy, within a human lifetime, or even within several, so claims of relative ‘superiority’ of one over another equally untestable model are frankly imaginary, and contemptible.

      A properly functioning, accurate climate model should barely vary at all over an interval of centuries of precision WX observations (and BOM and NOAA regularly demonstrate that there’s no such thing as a precision WX observational record …).

      And yet we are seeing ‘climate’ models veering away from a static trend, almost immediately, i.e. with mere decades of WX data (from mere overprinted WX cyclic noise … ridiculous!).

      Climate models are useless, and will be for many centuries; they don’t and can’t do what they claim. And even if they did work properly, with next to perfect accuracy of forecast trends over millennial intervals, you’d never live long enough to be able to tell, anyway!

      i.e. “… not even science …”.

      • CMIP5 models, used in the UN IPCC AR5 report, began their “projections” in (from?) 2005. Actual climatic metrics varied from those projected from then to now. Hell, they couldn’t even get the early (before 2005) 21st Century right.

        Depending on the metric, the models’ average temperatures ran “hot” by factors of 2 to 3. The variance in the bulk atmospheric temperature was particularly severe. The tropical tropospheric hot spot predicted by greenhouse gas theory, and reflected in the model projections, is conspicuously absent.

        I wonder how the CMIP6 models, to be used in the UN IPCC AR6 report, will be re-tuned to at least come close to the actual surface temperatures CMIP5 got so wrong. I expect all the re-tuning in the world will not get the tropical tropospheric temperature profile correct. This was and is a CAGW deal breaker. Watch for sleight-of-hand obfuscation in AR6.

  23. Willis
    I think this is an incorrect use of the Bonferroni correction.
    When Labitzke first looked for the relationship some forty years ago it may indeed have been data dredging, i.e. lacking a specific hypothesis. In that case the relationship should have formed the basis for a hypothesis requiring subsequent verification, and the p-value has limited value.
    Given there is now another forty years (approx) of data, then these new data can be separately analysed. The p-value using just the new data (by itself) can then be taken at face value without further adjustment.
    However, I note that van Loon and Labitzke have many publications on this and related topics. Accordingly, they may well have had a relevant hypothesis prior to Labitzke’s 1987 paper, which would, in that case, give credence to the initial p-value.

    • Thanks, Keith. Finding a “significant” correlation in one month out of twelve, in 4% of the planet, at one of seven atmospheric levels, during one of two conditions of equatorial winds, is absolutely a data dredge. And that is NOT changed by simply waiting and adding data. That just makes it an enduring data dredge.

      Next, you seem to be ignoring the fact that if you adjust for long-term persistence (LTP), the finding is NOT significant no matter what Bonferroni says …

      Regards,

      w.

      • Willis, in this particular case I do not have the expertise to assess whether auto-correlation is an issue (although I do often assess data for auto-correlation in situations where I do have the relevant scientific understanding). If the auto-correlation is only within and not between sunspot cycles then there are likely to be enough degrees of freedom to test and, where necessary, incorporate that in an analysis.
        But I cannot agree with you in relation to an enduring data dredge. The key issue in relation to data dredging and Bonferroni is that the new data set should be analysed separately from the initial data set and not just added to it. In that situation it is incorrect to do a Bonferroni adjustment.

        • Thanks, Keith. Seems like according to you all I need to do is find some obscure corner of the planet where sunspots correlate with something. Say, the temperature in one of twelve months up at 22,000 feet in the Arctic during the westerly phase of the QBO …

          Once I find that obscure corner, I wait for one year, and I analyze the new data from that one area, no need for Bonferroni, Bob’s yer uncle …

          In any case, near as I can tell, Labitzke added the new data and did NOT just analyze it separately.

          Best regards,

          w.

          • Willis, As I understand it – open to correction – there is now 40 years of new data since the hypothesis was developed. That would seem to provide enough degrees of freedom to work with, so as to either refute or confirm the relationship.
            I have no particular view as to the insights provided by Labitzke. My point is that in the same way that data dredging can lead to mis-use of p-values, the Bonferroni adjustment needs to be used with some caution and in ways that are appropriate.

          • Willis
            This is now taking on the flavor of a philosophical question. You said, “Once I find that obscure corner, I wait for one year, and I analyze the new data from that one area, no need for Bonferroni, Bob’s yer uncle …” I think that the crux of the problem is whether the correlation is a one time event, or whether it persists. As a practical matter, if the correlation persists, then one can indeed do a prediction. Then the problem becomes one of finding out why “that obscure corner” demonstrates a correlation with known physical phenomena, and why other areas do not.

        • Keith’s seems to be a legitimate argument, IMO. So in the original study they only determined their hypothesis (through observed correlations) and then were able to later evaluate the hypothesis with out-of-sample data (if that’s what actually occurred?). Albeit, the ‘hypothesis’ tested in the later (legitimate) test has very limited explanatory power (except just describing that particular relationship at that location, altitude, time of year, etc.). That seems to be real science.

  24. The Russian model may be accurate simply because one model has to be the most accurate due to chance. Sort of like a stopped clock being accurate twice a day.

    The problem with forecasting the future by dividing time into slices is that the error from each slice does not average to zero over time. It drifts, with no known means to correct.

    Climate science and the IPCC believe that over time the law of averages will average the error out to zero. Thus the multi-model mean.

    The reader is left to research the law of averages (hint: it is fake).

    • It is actually far worse than that, in that the temperature is being used to proxy the earth’s energy balance. So if the energy imbalance is building up or diminishing where your proxy does not measure, you are in for a bit of a surprise. There are plenty of places on Earth and in its atmosphere that aren’t 1 m off the ground for a ground-site station to measure.

  25. I found the Bonferroni correction to be interesting so I did some reading and found testing of the concept. What has been found is if the NULL hypothesis is known to be TRUE then the correction greatly increases the likelihood of accepting this NULL hypothesis. (That’s good)
    However, if the NULL hypothesis is known to be FALSE, the correction will significantly increase the chances of finding false negatives, ie wrongly accepting the NULL hypothesis. (That’s bad)
    So you have to ask yourself, which is better, false positives or false negatives. And this most likely depends on the problem you are considering.
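    The trade-off this comment describes can be demonstrated with a toy simulation (my own illustrative setup, not from the post): twenty tests per "family", half true nulls and half with a real effect of 0.5 standard deviations, comparing an uncorrected 0.05 threshold against the Bonferroni threshold 0.05/20.

```python
import math
import random

random.seed(7)

def pval(effect, n=25):
    """Two-sided p-value for the mean of n unit-variance normal draws."""
    xs = [random.gauss(effect, 1.0) for _ in range(n)]
    z = abs(sum(xs) / n) * math.sqrt(n)
    return 1.0 - math.erf(z / math.sqrt(2.0))   # equals 2*(1 - Phi(z))

m_null, m_real, alpha, reps = 10, 10, 0.05, 500
bonf = alpha / (m_null + m_real)                # Bonferroni-corrected threshold

fp_plain = fp_bonf = tp_plain = tp_bonf = 0
for _ in range(reps):
    null_ps = [pval(0.0) for _ in range(m_null)]   # true nulls
    real_ps = [pval(0.5) for _ in range(m_real)]   # real effects
    fp_plain += sum(p < alpha for p in null_ps)
    fp_bonf  += sum(p < bonf for p in null_ps)
    tp_plain += sum(p < alpha for p in real_ps)
    tp_bonf  += sum(p < bonf for p in real_ps)

print("false positives per family:", fp_plain / reps, "vs", fp_bonf / reps)
print("power:", tp_plain / (reps * m_real), "vs", tp_bonf / (reps * m_real))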
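    The trade-off this comment describes can be demonstrated with a toy simulation (my own illustrative setup, not from the post): twenty tests per "family", half true nulls and half with a real effect of 0.5 standard deviations, comparing an uncorrected 0.05 threshold against the Bonferroni threshold 0.05/20.

```python
import math
import random

random.seed(7)

def pval(effect, n=25):
    """Two-sided p-value for the mean of n unit-variance normal draws."""
    xs = [random.gauss(effect, 1.0) for _ in range(n)]
    z = abs(sum(xs) / n) * math.sqrt(n)
    return 1.0 - math.erf(z / math.sqrt(2.0))   # equals 2*(1 - Phi(z))

m_null, m_real, alpha, reps = 10, 10, 0.05, 500
bonf = alpha / (m_null + m_real)                # Bonferroni-corrected threshold

fp_plain = fp_bonf = tp_plain = tp_bonf = 0
for _ in range(reps):
    null_ps = [pval(0.0) for _ in range(m_null)]   # true nulls
    real_ps = [pval(0.5) for _ in range(m_real)]   # real effects
    fp_plain += sum(p < alpha for p in null_ps)
    fp_bonf  += sum(p < bonf for p in null_ps)
    tp_plain += sum(p < alpha for p in real_ps)
    tp_bonf  += sum(p < bonf for p in real_ps)

print("false positives per family:", fp_plain / reps, "vs", fp_bonf / reps)
print("power:", tp_plain / (reps * m_real), "vs", tp_bonf / (reps * m_real))
```

    In runs like this the Bonferroni threshold cuts false positives per family from roughly 0.5 to near 0.025, but power against the real effects drops from roughly 0.7 to roughly 0.3, exactly the false-negative cost the comment points to.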

  26. The modern scientific method suggests that after throwing the dice, the results of the raw data should be adjusted to match what 97% of a specially selected group of 87 scientists had predetermined with their models.

  27. Proof that Willis is wrong.

    a) Analysis of 30 hPa winter North Pole temperature segregated by solar activity, and quasi-biennial oscillation phase.
    b) 30 hPa winter North Pole temperature versus tropical 30 hPa wind speed for low solar activity years.
    https://i.imgur.com/qvLkx6Y.png

    30 hPa geopotential height for high solar activity years and low solar activity years. The difference is a whopping 240 meters. A quarter of a kilometer. Indicating profound differences in the density and temperature of the air column below 30 hPa all the way to the surface.
    https://i.imgur.com/0fSLxDf.png

    You never find something when you don’t want to find it.

    Here’s the idea that made me wonder—if you were going to look for some solar-related effect in February, where is the last place on Earth you’d expect to find it?

    Yep, you’re right … the last place you’d expect to find a solar effect in February would be the North Polar region, where in February there is absolutely no sun at all

    Your statistics knowledge is better than most people’s, but your way of thinking is too simplistic. You have an idea on how the Sun should affect climate if it did, that implies temperature-sunspots correlation and TSI effect on the surface, so you think that in the dark pole, with the Sun not shining, solar variability cannot have an effect. Since the Sun does not conform to your narrow view in its effect on Earth’s climate you are the person that could never find a solar effect on climate even if the solution was just in front of your nose.

    In my articles and comments I have already presented all the pieces of the puzzle about how solar activity affects (and at some times controls) global temperature. Curiously you have reached the opposite position to the solution, although I shouldn’t have expected less of you.

    • The hypothesis is that low solar activity results in very different 30 hPa conditions (geopotential height and temperature) over the winter North Pole depending on 30 hPa equatorial wind speed, and that the result is different when solar activity is high. The null hypothesis is that there is no difference that can be attributed to differences in solar activity. Bonferroni does not apply. Labitzke doesn’t need to meet him.

      People in North America are already having an experience of what Labitzke discovered. It is called the Polar Vortex. The Polar Vortex disorganization that results in very cold polar air intrusions depends on solar activity, quasi-biennial oscillation, and ENSO state. It is also affected by volcanic eruptions with a high stratospheric sulfate injection in the Northern Hemisphere.

      Solar variability does affect the climate depending on its deviation from average and time. Long periods of very low or very high solar activity have profound effects on climate becoming the most important factor on centennial timescales. The LIA was no joke.

    • Exactly Javier and explained very well. What Willis did was restate the hypothesis so he could refute it with a simple (too simple, really) statistic. Once the hypothesis is correctly stated and examined, the significance is obvious. There are lies, damn lies and statistics.

      • Agree.
        That is exactly Kant’s trick with the Critique of Pure Reason (knock something which does not exist). Which is a version of Newton’s hypotheses non fingo while sticking with absolute space – no wonder Kant defended Newton’s sleight of hand!
        This is exactly the Adam Smith Invisible Hand trick in economics.
        Kant, called the omni-destructeur by Heine, is second only to Adam Smith, the destroyer of national economies.

        What a parcel of rogues tossing dice!

          • Quite a bit, as CEO of the British East India Company running opium to China. The invisible hand trick turns up again in von Hayek’s “spontaneous but unknowable” mumbo-jumbo. All borrowed from Mandeville’s feudal economics. Marx used Smith as a straw dog to hit the American System of Matthew Carey, recently referred to by Pres. Trump.
            China is not impressed with its old opium lords’ “free trade” – the win-win BRI economics is actually based on the American System, and the world is in awe of its success.

          • bonbon, your suck and jive snark dance didn’t answer the question. Compare Marx’s contribution to world wealth vs Smith’s.

      • Andy

        I have another take on what the error is:

        Apart from the error of assuming that darkness means solar effects are going to be missing, there is the concept of running an experiment under controlled conditions.

        In many cases, seeking the isolation of a mechanism requires very particular conditions for it to be demonstrated. Under certain identified conditions, there has been demonstrated a correlation between solar activity (using sunspots as a proxy) and a particular effect in the atmosphere. The atmosphere is very complex and it is possible that only under certain conditions can the effect/correlation be clearly identified.

        Just because some mechanism is only isolated under certain conditions doesn’t mean the mechanism is not present. The counterfactual is that the cause of the effect is not solar activity. So it is, or it is not. That is the question, not whether the effect has been identified within an arbitrary statistical value range.

        We can make claims to have established the mechanism, or we can claim to have established a correlation, or we can claim there is not enough reasonable proof, but we can’t easily say the correlation is accidental and does not represent any real mechanism, if it is predictable.

        If the effect is predictable, then it is reasonable to accept the cause is solar activity. I could perform a hundred different experiments trying to find the breeding grounds of Great White Sharks and find that only one of them produced a correlation between a physical location and a lot of breeding Great Whites. The fact that most of the experiments failed doesn’t mean the single example was wrong or uncertain.

        The theory that the moon pulls the oceans and makes them go up and down is great at explaining why there are two tides per day, except for those places where there is only one. We observe two tides per day only when the conditions are right. Extrapolate. There is a correlation, the conditions have been established when this effect can be observed, and the objection that it cannot be the sun because it is dark (i.e. that all solar effects on the earth have to be created by visible light) is unreasonable.

        If it can make useful predictions, then it is a useful hypothesis. With time, other sets of contributing factors may predict other effects or combinations thereof.

    • Proof that Willis is wrong.

      remember the golden rule about quoting whatever it is you disagree with?

      In fact what you show does a better job than the paper being discussed.

  28. “Yep, you’re right … the last place you’d expect to find a solar effect in February would be the North Polar region, where in February there is absolutely no sun at all ”

    Depends. If the effect is due to the charged particles from the solar wind, or solar flares, the position of the Earth’s axis is more or less irrelevant. These particles hit the Earth’s magnetosphere and get funneled to the poles, where they cause the aurora, among other things.

    If the claimed effect can be traced to particle-induced nucleation, i.e. Svensmark, then it might very well show up at the pole, even during winter.

    • ‘Now that Cluster and THEMIS have directly sampled FTEs, theorists can use those measurements to simulate FTEs in their computers and predict how they might behave. Space physicist Jimmy Raeder of the University of New Hampshire presented one such simulation at the Workshop. He told his colleagues that the cylindrical portals tend to form above Earth’s equator and then roll over Earth’s winter pole. In December, FTEs roll over the north pole; in July they roll over the south pole.’

      https://science.nasa.gov/science-news/science-at-nasa/2008/30oct_ftes

  29. Sudden Stratospheric Warming events are more common during a warm AMO phase, and during El Nino years. Which is all about when the solar wind is weaker. Picking years near sunspot cycle maximum as a measure of higher solar activity as in the chart that Javier posted 2 days ago on the Solar Signature post won’t work as many of those years had weaker solar wind states.

    https://i.imgur.com/0fSLxDf.png

    Solar plasma data
    https://snag.gy/d2v3aJ.jpg

    Sudden Stratospheric Warming events are well connected to the tropical stratosphere where strong cooling occurs during a SSW event.
    https://www.cpc.ncep.noaa.gov/products/stratosphere/temperature/05mb2525.png

  30. sunspot-related variations in solar output were shown by Labitzke et al. to affect the stratospheric temperature over the North Pole, viz:

    Even if so, the stratosphere has almost no mass and heat-content. It does not affect the climate, the atmosphere from below affects IT. The tail does not wag the dog.

    • “There are three stages of scientific discovery: first people deny it is true; then they deny it is important; finally they credit the wrong person.”
      Alexander von Humboldt, as cited by Bill Bryson in “A short history of nearly everything” (2003).

  31. Willis, You really like to talk about sunspots vs temperature.

    You appear to have never looked at the research concerning how the sun modulates planetary cloud cover.

    Sun spot count (closed magnetic flux on the surface of the sun) is a rough measure of one of two solar phenomena that create solar wind bursts. The solar wind bursts remove cloud forming ions from the high latitude regions and the tropics changing the amount of cloud cover and the properties of the clouds, by creating a space charge differential in the ionosphere. The process where solar wind bursts remove and add ions to the clouds is called electroscavenging.

    Electroscavenging is what amplifies or inhibits El Niño events. Coronal holes, open magnetic flux regions on the sun, also cause solar wind bursts. What causes coronal holes to form is not known.

    P.S. For some unexplained reason the sun is now (William: The solar coronal holes are still there) covered with coronal holes. The coronal holes of course cause solar wind bursts, which explains why the planet has not cooled due to the increase in GCR.

    Coronal holes can persist for months and have for some unknown reason occurred late in the solar cycle in low latitude regions, thereby causing solar wind bursts to occur when there are few or no sunspots on the surface of the sun. Coronal holes make it appear that the solar magnetic cycle is not the primary modulator of the earth’s climate.

    The solar magnetic cycle modulates, through changes to the solar heliosphere, the amount of high-speed particles that strike the earth’s atmosphere creating cloud-forming ions (called cosmic ray flux (CRF) or galactic cosmic rays (GCR) for historical reasons; the discoverers thought the phenomenon was caused by a ray rather than a particle, and the misleading name stuck).

    The following is a review paper that discusses some of the mechanisms by which solar changes modulate planetary climate.

    http://www.albany.edu/~yfq/papers/TinsleyYuAGU_Monograph.pdf

    http://sait.oat.ts.astro.it/MmSAI/76/PDF/969.pdf

    Once again about global warming and solar activity
    Solar activity, together with human activity, is considered a possible factor for the global warming observed in the last century. However, in the last decades solar activity has remained more or less constant while surface air temperature has continued to increase, which is interpreted as an evidence that in this period human activity is the main factor for global warming. We show that the index commonly used for quantifying long-term changes in solar activity, the sunspot number, accounts for only one part of solar activity (William: Closed magnetic field) and using this index leads to the underestimation of the role of solar activity in the global warming in the recent decades. A more suitable index is the geomagnetic activity (William: Short term abrupt changes to the geomagnetic field caused by solar wind bursts, which are measured by the short term geomagnetic field change parameter Ak. Note the parameter is Ak rather than the monthly average which Leif provides a graph for. The effect is determined by the number of short term wind bursts. A single very large event has less effect than a number of events. As coronal holes can persist for months and years, and as the solar wind burst effect lasts for roughly a week, a coronal hole has a significant effect on planetary temperature) which reflects all solar activity, and it is highly correlated to global temperature variations in the whole period for which we have data. …

    ..The geomagnetic activity reflects the impact of solar activity originating from both closed and open magnetic field regions, so it is a better indicator of solar activity than the sunspot number which is related to only closed magnetic field regions. It has been noted that in the last century the correlation between sunspot number and geomagnetic activity has been steadily decreasing from 0.76 in the period 1868–1890, to 0.35 in the period 1960–1982, while the lag has increased from 0 to 3 years (Vieira et al. 2001).

    ..In Figure 6 the long-term variations in global temperature are compared to the long-term variations in geomagnetic activity as expressed by the ak-index (Nevanlinna and Kataja 2003). The correlation between the two quantities is 0.85 with p<0.01 for the whole period studied. It could therefore be concluded that both the decreasing correlation between sunspot number and geomagnetic activity, and the deviation of the global temperature long-term trend from solar activity as expressed by sunspot index are due to the increased number of high-speed streams of solar wind on the declining phase and in the minimum of sunspot cycle in the last decades.

    • William Astley February 26, 2019 at 7:09 am

      Willis, You really like to talk about sunspots vs temperature.

      You appear to have never looked at the research concerning how the sun modulates planetary cloud cover.

      Questions generally beat assertions. For example, you might ask “Willis, have you looked at the research concerning how the sun modulates planetary cloud cover?”

      Then I could say “Yep! See my posts entitled ‘Splicing Clouds’ and ‘Clouds Down Under’.”

      Or you could simply assert that I haven’t looked at cloud cover … and be totally wrong.

      Your choice …

      w.

  32. “That gives 504 different combinations. Heck, even if we leave out the seven levels, that’s still 72 different combinations. So at a very conservative estimate, we’d need to find something with a p-value of 0.05 divided by 72, which is 0.0007 … and the p-value of her finding is about three times that. Not significant.”

    Willis,

    I believe with N measurements, the uncertainty we look for (your p value) is reduced by square-root N.

    Thus sqrt (504) = 22.45 and sqrt (72) = 8.49.

    So p = 0.05/22.45 = 0.0022 and p = 0.05/8.49 = .0059.

    The chart above shows their finding … which is that if you look at the temperature in February, at one of seven different possible sampled levels of the stratosphere, over the North Pole, compared to the January sunspots lagged by one month, during the approximately half of the time when the equatorial stratospheric winds are going west rather than east, the p-value is 0.002.

    p-value = 0.002 < 0.0022 and < 0.0059.

    Significant.

    • Joel, it’s not clear where you are getting your ideas about square roots.

      Suppose we look in three places, and the desired p-value is < 0.05. What are the odds that we find it by chance alone? If "pval" is the desired p-value and "ntries" is the number of places we look, the formula is:

      1 - (1 - pval)^ntries = 1 - (0.95)^3 = 0.14

      Now, suppose we want that final number to be less than 0.05 ... how small must pval_original be to give that answer? If "fval" is the final number, then

      fval = 0.05 = 1 - (1 - pval_original)^ntries

      Solving for pval_original gives us

      pval_original = 1 - (1 - fval)^(1/ntries)

      And that is the 100% accurate formula. It is also the formula that I derived myself and used for many years ... until someone said the magic words, "Bonferroni Correction". Now, what Bonferroni realized and I didn't was that, with a vanishingly small error,

      pval_original = 1 - (1 - fval)^(1/ntries) ≈ fval/ntries

      With a pval_original of 0.05, the maximum error using the Bonferroni approximation is 0.0003 ... So if we're looking to get a final value of 0.05 with multiple tries, we need to find something with a p-value of 0.05 / ntries.

      w.
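      The exact correction and the Bonferroni approximation in the reply above can be compared numerically. A minimal sketch (the helper names are illustrative, not from any of the posts):

```python
# Compare the exact per-test threshold with the Bonferroni approximation,
# following the formulas in the reply above.

def exact_threshold(fval, ntries):
    # Solve fval = 1 - (1 - p)^ntries for the per-test p-value p.
    return 1 - (1 - fval) ** (1 / ntries)

def bonferroni_threshold(fval, ntries):
    # Bonferroni's approximation: p ~ fval / ntries.
    return fval / ntries

# Family-wise chance of at least one hit when looking in 3 places at p < 0.05:
print(1 - (1 - 0.05) ** 3)  # about 0.14

# Per-test thresholds for 3, 72 and 504 looks:
for ntries in (3, 72, 504):
    print(ntries, exact_threshold(0.05, ntries), bonferroni_threshold(0.05, ntries))
```

      For 72 looks both versions give a threshold of roughly 0.0007, which is the figure quoted in the comment above.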

  33. Yep, you’re right … the last place you’d expect to find a solar effect in February would be the North Polar region, where in February there is absolutely no sun at all … doesn’t make it impossible. Just less probable.

    Ions and negatively charged particles from the sun still get ‘sucked in’ by the earth’s magnetic field in winter, don’t they ?

    • No, there is a possible rational mechanism besides direct sunlight.

      February is the coldest month in the Arctic. Sunspots follow the solar cycle. The solar wind speeds and solar wind particle flux (electrons) follow the solar cycle. This would have to be a solar-cycle-related geomagnetic disturbance and particle-injection phenomenon at the magnetic poles, acting at that altitude (22 km).

  34. Willis’ analysis of sunspots vs temperature is not useful (it does not prove or disprove anything), as sunspots do not modulate the earth’s climate; solar cycle changes do.

    If a person does not understand the mechanisms plotting data is just a sciency looking game.

    If you look at my comment above it includes a link to an excellent review paper by Tinsley that explains the mechanism by which solar cycle changes modulate planetary cloud cover.

    The paper below by Svensmark ‘The Antarctic climate anomaly and galactic cosmic rays’ is one of the best proofs that solar cycle changes modulate planetary temperature.

    The AGW paradigm pushers have no explanation as to what physically causes the effect and have conveniently ignored the fact that the paleoclimatic data shows the Arctic cyclically warms and cools.

    Interestingly, the Antarctic also warms and cools; however, the Antarctic warming and cooling is out of phase with the Arctic warming and cooling.

    The out of phase warming and cooling of the two poles is called by the specialists ‘the polar see-saw’. (i.e. Exactly what we are observing now, a polar see-saw. The Antarctic ice sheet cooled and the Greenland Ice sheet warmed due to the reduction in cloud cover.)

    The 20th century warming follows the same pattern as previous warmings (see the paper below that discusses this cyclic warming, which in every case is followed by cooling). These changes in planetary temperature correlate with a significant increase in solar magnetic cycle activity and also correlate with significant changes to the geomagnetic field strength and orientation.

    The geomagnetic field changes strongly affect the GCR flux that strikes the earth, particularly in high-latitude regions.

    There was also a significant increase in solar magnetic cycle activity during the other polar see-saws.

    The Antarctic ice sheet is out of phase with Dansgaard-Oeschger cycles and hence cools when the other high-latitude regions warm. Svensmark explains the mechanism in the attached paper.

    The albedo of the Antarctic ice sheet is greater than that of clouds. That fact and the fact that the Antarctic ice sheet is isolated from the surrounding Southern Ocean by the polar vortex explains the phenomenon.

    The Southern Ocean warms and cools in phase with the D-O cycles.

    The Greenland ice sheet (the Greenland Ice sheet is not isolated by a polar vortex) also warms and cools in phase with D-O cycles.

    The out of phase temperature changes comparing the Greenland Ice sheet to the Antarctic Ice sheet is called the Polar See-Saw.

    http://arxiv.org/abs/physics/0612145v1

    The Antarctic climate anomaly and galactic cosmic rays

    Borehole temperatures in the ice sheets spanning the past 6000 years show Antarctica repeatedly warming when Greenland cooled, and vice versa (Fig. 1) [13, 14]. North-south oscillations of greater amplitude associated with Dansgaard-Oeschger events are evident in oxygen-isotope data from the Wurm-Wisconsin glaciation[15]. The phenomenon has been called the polar see-saw[15, 16], but that implies a north-south symmetry that is absent. Greenland is better coupled to global temperatures than Antarctica is, and the fulcrum of the temperature swings is near the Antarctic Circle. A more apt term for the effect is the Antarctic climate anomaly.

    Attempts to account for it have included the hypothesis of a south-flowing warm ocean current crossing the Equator[17] with a built-in time lag supposedly intended to match paleoclimatic data. That there is no significant delay in the Antarctic climate anomaly is already apparent at the high-frequency end of Fig. (1). While mechanisms involving ocean currents might help to intensify or reverse the effects of climate changes, they are too slow to explain the almost instantaneous operation of the Antarctic climate anomaly.

    Figure (2a) also shows that the polar warming effect of clouds is not symmetrical, being most pronounced beyond 75◦S. In the Arctic it does no more than offset the cooling effect, despite the fact that the Arctic is much cloudier than the Antarctic (Fig. (2b)). The main reason for the difference seems to be the exceptionally high albedo of Antarctica in the absence of clouds.

    It is a fact that earth’s climate changes cyclically.

    http://www.agu.org/pubs/crossref/2003/2003GL017115.shtml

    Timing of abrupt climate change: A precise clock by Stefan Rahmstorf

    Many paleoclimatic data reveal an approximately 1,500-year cyclicity of unknown origin. A crucial question is how stable and regular this cycle is. An analysis of the GISP2 ice core record from Greenland reveals that abrupt climate events appear to be paced by a 1,470-year cycle with a period that is probably stable to within a few percent; with 95% confidence the period is maintained to better than 12% over at least 23 cycles. This highly precise clock points to an origin outside the Earth system; oscillatory modes within the Earth system can be expected to be far more irregular in period.

    The following is a link to Bond’s paper “Persistent Solar influence on the North Atlantic Climate during the Holocene”

    http://www.essc.psu.edu/essc_web/seminars/spring2006/Mar1/Bond%20et%20al%202001.pdf

    Excerpt from the above linked paper:

    “A solar influence on climate of the magnitude and consistency implied by our evidence could not have been confined to the North Atlantic. Indeed, previous studies have tied increases in the C14 in tree rings, and hence reduced solar irradiance, to Holocene glacial advances in Scandinavia, expansions of the Holocene Polar Atmosphere circulation in Greenland; and abrupt cooling in the Netherlands about 2700 years ago…Well dated, high resolution measurements of O18 in stalagmite from Oman document five periods of reduced rainfall centered at times of strong solar minima at 6300, 7400, 8300, 9000, and 9500 years ago.”

    • So since the mid 1990’s, solar wind pressure has declined, and so has low cloud cover, and the Arctic has warmed.

  35. [Before I comment on Willis lack of attire, I will quote the words he has posted that I am using as the basis for my criticism:

    “Thanks, Keith. Seems like according to you all I need to do is find some obscure corner of the planet where sunspots correlate with something. Say, the temperature in one of twelve months up at 22,000 feet in the Arctic during the westerly phase of the QBO …”] – Posted above on February 25th 10:18 p.m.

    My Response:

    Someone has to step in and tell Willis that he might have a shirt on, but he has forgotten to put on his pants.

    He has his shirt on because he is 100% correct in pointing out that the Bonferroni correction and long-term persistence (LTP) effects must be factored into assessing the appropriate “p” value for a correlation to be significant. Willis needs to be thanked for pointing this out to the skeptic community.

    However, Willis is not wearing his pants because he overlooks some basic concepts that can be used by scientists to conduct a valid (or fair) experiment.

    In order to understand why I am making this claim, I need to refer back to comments that Willis made about one of my comments in an earlier WUWT post by Anthony Watts on the 13th of February:

    https://wattsupwiththat.com/2019/02/13/new-paper-attempts-to-link-solar-cycles-and-streamflows/

    See my post at 8:10 p.m. and Willis’ response at 11:48 p.m. on the 14th of February.

    I made a claim that there was spectral evidence to prove that the summer-time maximum median temperatures in Adelaide between 1888 and 2013 showed evidence of the influence of the 22-year solar magnetic cycle.

    Willis claimed that I was guilty of arbitrarily cherry-picking measurement methods, places and times, and that this made the observed strong spectral evidence worthless. It is true that my spectral evidence was not a correlation and that I was not claiming a “p” value; however, the arguments I give below apply equally to anyone who tries to use the Bonferroni principle to arbitrarily dismiss an author’s claimed correlation between the solar cycle and meteorological indices.

    In his reply Willis said:

    [Note that the reply by Willis to me in this earlier WUWT post is similar to the quoted reply he has given to Keith that I posted at the beginning of this comment.]

    “So out of the thousands of temperature stations around the planet, you pick one of them [Adelaide, S.A.].
    That’s not good enough, however, so you pick one season out of one of them.
    Still not good enough, so you pick just the max temperature, not the min.
    Then you say that means don’t work for your method, you need to used medians.
    Then you remove a “slight long-term smooth parabolic variation” by subtracting the PC1 and the PC5 from what you got from all of the previous cherry picking.
    And finally, you use Singular-Spectrum-Analysis on the results …”

    Let’s address each one of these claims by Willis and see why they are an inappropriate application of the Bonferroni principle.

    1. “That’s not good enough, however, so you pick one season out of one of them.”

    The sub-tropical high-pressure ridge (STHPR) over Australia moves north and south with the seasons. In the winter-time, its peak is located about 30 degrees south of the Equator and in the summer-time, it can be located as far south as 40 – 45 degrees.

    In 2012, I published a paper which showed that the summer-time [DJF] latitude anomaly of the peak of the sub-tropical high-pressure ridge (STHPR) over Australia showed 9.3-year and 3.8-year variations that were in-phase with the Lunar tidal cycles.

    [Note that the slow seasonal movement of the STHPR to the north and south is a nuisance if you want to look for the possible effects of the Moon on the latitude of the peak of the STHPR, so you need to control for this movement by picking a specific season. Hence, I chose to use the summer months [DJF] to measure the southernmost latitude of the high-pressure ridge (i.e. the latitude anomaly of the peak).]

    After publishing this paper, I then became interested in heat waves that affected the southern Australian coast. These summer-time heat waves regularly affected the cities of Melbourne and Adelaide, and I speculated that their frequency might be affected by how far south the high-pressure cells in the south-eastern corner of Australia extended. My interest was piqued because I thought that the heat waves might be caused by the lunar influence upon the southernmost latitude of the STHPR. The fact that my 2012 paper had found the lunar influence detectable in the summer-time [DJF] data was good, because it also corresponded to the time when the heat waves over SE Australia were prevalent.

    When the high-pressure cells move off the south-east coast of Australia, they direct northerly winds off the Australian deserts down towards the cities of Melbourne and Adelaide causing heat waves.

    2. “So out of the thousands of temperature stations around the planet, you pick one of them [Adelaide, S.A.].”

    Yes, I chose Adelaide because it had a long temperature record [at the time from 1888 to 2013] and its heat waves were more pronounced than those of Melbourne.

    3. “Still not good enough, so you pick just the max temperature, not the min.”

    I think if you are studying heat waves it seems to be appropriate to select the maximums instead of the minimums.

    4. “Then you say that means don’t work for your method, you need to used medians.”

    The day-to-day summertime maximum temperatures of the city of Adelaide vary by up to 10 C in a matter of a few days. This means that the median is a much better representation of the central tendency than the mean, since the latter is far more sensitive to outliers.

    5. “Then you remove a “slight long-term smooth parabolic variation” by subtracting the PC1 and the PC5 from what you got from all of the previous cherry pickings.”

    You can actually see the effect of the removal of the PC1 and PC5 principal modes upon the final spectrum, and it only has a significant effect upon the spectral power for periods longer than about 11–15 years. So, no, this step does not affect the final conclusions.

    Willis said: “And finally, you use Singular-Spectrum-Analysis on the results …”

    The SSA is done on the raw time series without any smoothing. As far as I can tell, the SSA technique is a valid method for analyzing time series.

    Now after all of this, what did I find? Remember, I was looking for a lunar effect upon the timing of heat-waves. What I found is a pronounced solar effect which I wasn’t even looking for!!!!

    Hence, all I am claiming is that if the Sun’s magnetic cycle can have an influence upon a known regional meteorological metric (i.e. the strength and vorticity of high-pressure cells in the Southern hemisphere) then it is possible that it could have an influence further afield.

    Hence, I do not believe that Willis is correct that picking places (regions), times (seasons) and methods (SSA) is totally arbitrary. There are often valid reasons for making specific picks which preclude the use of the Bonferroni correction, simply because the picks are based on scientifically justifiable reasons, e.g.

    1) studying heat-waves and choosing maximum temperatures instead of minimum,
    2) choosing a particular location on the Earth because it is a good example of a place that is affected by a regional meteorological metric that spans over thousands of kilometers. etc. etc.

  36. Why hasn’t the planet cooled? The solar cycle is weak.

    It hasn’t cooled due to the sudden appearance of coronal holes on the surface of the sun. The coronal holes create solar wind bursts which create a space charge differential in the ionosphere, which in turn removes cloud-forming ions.

    The sun is currently spotless and there is a super large coronal hole CH909 pointing at the earth. As explained below the wind bursts from coronal holes ‘override’ the solar GCR effect on clouds.

    http://www.solen.info/solar/

    Solar wind bursts cause the planet to warm by creating a space charge differential in the ionosphere which in turn causes an electric current flow from high latitude regions of the planet to the equator. The return path for the current is in the ocean. This process is called electroscavenging.

    Sunspots and coronal holes both affect the strength and extent of the solar heliosphere which is the name for the tenuous gas and pieces of the magnetic field that are thrown off the sun. The heliosphere extends well past the orbit of Pluto.

    The solar heliosphere blocks GCR (galactic cosmic rays, mostly high-speed protons). So when the solar heliosphere is strong, the pieces of magnetic field in it block GCR and fewer GCR strike the earth.

    The increased GCR will cause the planet to cool at high latitude regions, if there are not solar wind bursts to remove the cloud forming ions. GCR will only cause the planet to cool at high latitude regions as the earth’s magnetic field in lower latitudes blocks the GCR.

    Note the difference in the regions of the planet that are affected by solar wind bursts and by the heliosphere’s modulation of the amount of GCR that strikes the earth: electroscavenging affects both high-latitude regions and the equator, while the heliosphere’s GCR modulation affects only high-latitude regions. This comment is true as long as the geomagnetic field is not strongly tilted or in an excursion.

    http://www.klimarealistene.com/web-content/Bibliografi/Tinsley2007,GlobalElectricCircuit.pdf

    Atmospheric Ionization and Clouds as Links between Solar Activity and Climate, By Brian Tinsley and Fangqun Yu

    The GCR flux is responsible for almost all of the production of ionization below 15 km altitude, that determines the conductivity in that region. The MeV electrons and their associated X-rays produce ionization in the stratosphere, and affect the conductivity there. The current flow in the global electric circuit is generated mainly by charge separation in deep convective clouds in the tropics, and maintains the global ionosphere at a potential of about 250 kV (250,000 volts). Variations above and below this value occur in the high latitude regions due to solar wind – magnetosphere – ionosphere coupling processes. The current density Jz varies horizontally due to variations in the local vertical column resistance (this is affected by the GCR and MeV electron fluxes) and by variations in the local ionospheric potential (especially to those in the high latitude regions). Because Jz flowing through clouds in the troposphere responds to conductivity and potential changes occurring all the way up to 120 km altitude, it is a very effective coupling agent for linking inputs in the stratosphere and ionosphere with cloud levels.

    • “Why hasn’t the planet cooled? The solar cycle is weak.”

      Because weaker solar wind states drive warm ocean phases via the annular modes. A warm AMO/Arctic and increased El Nino conditions are normal during a solar minimum. The warm ocean phases drive an increase in low-mid troposphere water vapour, and a decline in low cloud cover. Negative feedbacks all the way, including increased ocean heat uptake with the reduced low cloud cover.

  37. To solar cycle fans

    Javier posted this paper previously:
    Solar forcing of semi-annual variation of length of day
    https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2010GL043185

    It’s an interesting paper but the authors do not seem to know the physical mechanism behind how the 11-year solar cycle affects the length of day (LOD). My X-Magneto theory predicts the timing and variation in amplitude of LOD. I can calculate this variation using my Thermomagneto Equations without searching for correlations, p-values, sunspot number, Fibonacci numbers, etc.

    Thermomagneto Equations:
    X = r arctan (B/Bc sin w)
    F = π r^2 P
    q = X u F a N

    Where: X is magnetic moment displacement; r is atomic radius of Fe; B is solar magnetic field; Bc is geomagnetic field at Earth’s core; w is angular velocity of solar magnetic field; F is gravitational force per atom; P is pressure at core; q is friction heat rate; u is coefficient of dynamic friction; a is portion of atoms susceptible to paramagnetism; N is number of Fe atoms in core

    The solar magnetic field and geomagnetic field are vectors. From vector addition, the magnetic field vector is minimal at 0 and 180 deg. angles and maximal at 90 deg. The 11-yr cycle is equal to 180 deg. (reversal of magnetic field). From this, I deduce the peak amplitude of LOD will occur halfway through the cycle, which corresponds to 90 deg., or at 5.5 years from the minimum amplitude. If you look at Figure 1 in the paper, you will see it follows the 5.5-yr interval in maximum and minimum amplitude. Hence, my theory correctly predicts the timing of amplitude.

    Next I will calculate the variation in amplitude. X-magneto theory states that geothermal heat is produced by friction heating at Earth’s core due to the rotation of solar magnetic field. From my Thermomagneto Equations, I deduce this relationship:
    w = dq/q = dT/T
    It means the angular velocity of solar magnetic field (w) is proportional to the change in friction heat rate (dq) and the change in core temperature (dT). Solving for dT in the above equation:
    dT = w T
    w = 9.0 E-9 rad/s, T = 4500 K = temperature of outer core
    dT = 4.1 E-5 K

    Next I calculate the linear thermal expansion of the core.
    a = dR/(R dT)
    where: a is coefficient of thermal expansion of Fe; dR is change in core radius; R is outer core radius
    Solving for dR in the above equation:
    dR = a R dT
    a = 10^-5, R = 3480 km
    dR = 1.4 E-3 m

    Next I apply the conservation of angular momentum. Everybody’s favorite example is the spinning ice skater.
    L = m r^2 w = k
    Where: L is angular momentum; m is mass; r is radius of gyration; w is angular velocity; k is constant
    From the above equation, I derive this relationship:
    w1/w2 = (r2/r1)^2 = t2/t1
    Let: r1 = r, r2 = r + dR, t1 = t, t2 = t + dt
    Substituting the variables:
    ((r + dR)/r)^2 = (t + dt)/t
    Where: r is radius to the center of hemispherical mass; t is average LOD time; dt is change in LOD time
    Solving for dt in the above equation:
    dt = t (1 + dR/r)^2 – t
    t = 86164 s, dR = 1.4 E-3 m, r = 2.4 E+6 m = 3/8 Earth radius
    dt = 0.1 ms

    This means the variation in amplitude from minimum to maximum is 0.1 ms. Look at Figure 1 in the paper. The amplitude ranges from 0.3 to 0.4 ms, more or less. The variation is 0.4 – 0.3 = 0.1 ms. It matches the variation I calculated. My X-Magneto theory predicts the timing and variation in amplitude of LOD. Not bad for a theory named after Magneto of X-Men :-0
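    Whatever one makes of the theory itself, the arithmetic in the comment above can be checked mechanically. A minimal sketch using only the commenter’s stated inputs (it verifies the numbers, not the physics):

```python
# Check the arithmetic in the comment above, using the commenter's own
# stated inputs. This verifies the calculation steps, not the theory.

w = 9.0e-9     # stated angular velocity of solar magnetic field, rad/s
T = 4500.0     # stated outer-core temperature, K
dT = w * T     # claimed core temperature change, K (comment says ~4.1e-5)

a = 1e-5       # stated thermal expansion coefficient of Fe, 1/K
R = 3480e3     # stated outer-core radius, m
dR = a * R * dT  # claimed change in core radius, m (comment says ~1.4e-3)

t = 86164.0    # sidereal day, s
r = 2.4e6      # commenter's "radius to center of hemispherical mass", m
dt = t * (1 + dR / r) ** 2 - t  # claimed LOD change, s (comment says ~0.1 ms)

print(dT, dR, dt * 1000)  # dt printed in milliseconds
```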

  38. To prove my point above that this problem should be treated as a regression, I constructed a regression tree with Cubist (aka “M5”), using the data referenced in Willis’ scatter plot above:
    https://www.geo.fu-berlin.de/en/met/ag/strat/produkte/northpole

    Using only that data, I built a piecewise linear Cubist regression tree to predict the 30hPa temperature, using the sunspot counts and other attributes contained in that file. [A small Python script (below) preprocesses the file into the standard Cubist training data format.]

    The resulting model has an average error of 2.9 degrees in predicting the 30hPa temperature from the other attributes. The ‘relative error’ is a comparison of the prediction errors with those of a default model using only the target mean as a predictor. Correlation is between the predicted temperature and the temperature in the training data.

    Evaluation on training data (270 cases):
    Average |error| 2.9
    Relative |error| 0.29
    Correlation coefficient 0.94
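    A sketch of how these Cubist-style figures can be computed: relative |error| is the model’s mean absolute error divided by the MAE of a default model that always predicts the target mean. The arrays here are illustrative, not the Labitzke data.

```python
# Reproduce Cubist-style evaluation metrics on a toy example.
# relative |error| = model MAE / MAE of a mean-only baseline.
import numpy as np

actual    = np.array([-79.0, -73.0, -68.0, -55.0, -45.0])
predicted = np.array([-77.0, -74.0, -66.0, -57.0, -44.0])

mae          = np.mean(np.abs(predicted - actual))          # average |error|
baseline_mae = np.mean(np.abs(actual.mean() - actual))      # mean-only model
relative_err = mae / baseline_mae                           # relative |error|
corr         = np.corrcoef(predicted, actual)[0, 1]         # correlation

print(round(mae, 2), round(relative_err, 2), round(corr, 2))
```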

    To mitigate over-fitting of the data, I also ran a 10-fold cross validation, using 90% of the data for training and 10% for blind validation, repeating that 10 times, so that all of the data was used for training and validation. This resulted in a slightly higher average error of 3.2 degrees.

    Summary:
    Average |error| 3.2
    Relative |error| 0.32
    Correlation coefficient 0.93
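    The 10-fold cross-validation scheme described above can be sketched in a few lines. The “model” here is just the training mean, standing in for the Cubist tree, and the temperatures are illustrative:

```python
# Sketch of 10-fold cross-validation: hold out each fold once for
# validation, fit on the remaining 90%, and average the held-out errors.
import random

def ten_fold_cv(cases, k=10, seed=0):
    rng = random.Random(seed)
    idx = list(range(len(cases)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds
    errors = []
    for fold in folds:
        train = [cases[i] for i in idx if i not in fold]
        mean = sum(train) / len(train)      # "fit" on ~90% of the data
        errors += [abs(cases[i] - mean) for i in fold]  # score held-out 10%
    return sum(errors) / len(errors)

temps = [-79, -73, -68, -66, -55, -55, -54, -48, -45, -71] * 3
print(ten_fold_cv(temps))
```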

    Here are the attributes declared in the model, and extracted from the training data. (Pipe symbol ‘|’ denotes comment)

    | Predict stratosphere 30hPa temperature given sunspot number and attributes
    | https://www.geo.fu-berlin.de/en/met/ag/strat/produkte/northpole

    TEMP. | target
    ID: label. | year ID
    RJ: continuous. | Sunspot count
    QBO: east,west,e/w,w/e. | QBO phase
    TRT: early,late. | Transition time
    C: True,False. | Cold flag
    CW: True,False. | Canadian Warmings
    FW: True,False. | Major Final Warming
    STAR: True,False. | Major Mid-winter warming
    MONTH: nov,dec,jan,feb,mar,apr. | Month
    TEMP: continuous. | 30hPa temperature

    | Uncomment these to exclude or include only specific attributes etc.
    | attributes excluded: RJ.
    | attributes included: RJ,QBO,MONTH.

    Here is the script which converts the tab-delimited training data table to a cubist .data file. ‘lbzke.table’ was created simply by using a mouse to cut and paste the html table to a text file. In Chrome, the tab chars are automatically created to delimit table fields correctly. Afterwards, a few records with blank temperatures were deleted.

    import re

    month = ['nov', 'dec', 'jan', 'feb', 'mar', 'apr']
    with open('lbzke.table', 'r') as f:
        for line in f:
            word = line.strip().split('\t')
            id = word[0]
            if id[0] != '1':
                continue  # skip header/non-data rows
            rj = word[1]
            qbo = word[5]
            trt = word[9]
            t = word[2:5] + word[6:9]  # the six monthly temperature fields

            for i in range(6):  # iterate over monthly values
                r = re.findall(r"([-+]?[0-9]*\.?[0-9]+)(.*)", t[i])
                temp = '.'
                flag = '.'
                if len(r) == 1:
                    temp = r[0][0]
                    flag = r[0][1]

                star = ('*' in flag)
                cw = ('CW' in flag)
                fw = ('FW' in flag)
                c = ('C' == flag)

                # ID,RJ,QBO,TRT,C,CW,FW,STAR,MON,TEMP
                print("%s_%d,%s,%s,%s,%s,%s,%s,%s,%s,%s" %
                      (id, i + 1, rj, qbo, trt, c, cw, fw, star, month[i], temp))

    In using this multivariate regression, the important question is not “Did I count my Bonferronies?”.
    Rather, it is “How much did each variable contribute to the construction of this piecewise linear regression tree?”

    Surprisingly, or not, the sunspot counts and QBO flags were not needed at all to achieve this result:

    Attribute usage:
    Conds Model

    100% MONTH
    90% C
    37% STAR
    11% FW

    So 100% of the regression rules used MONTH, but only 11% used the major-final-warming flag etc. RJ and QBO were available, but not needed to achieve the given results.

    Conclusion
    Therefore, for this dataset, I also reject the sunspot/QBO hypothesis. Just like Willis did. The difference is that my modeling tool told me this up front. Willis’ rejection was based on non-explanatory blind principles (Bonferroni etc), which resulted in a lot of hand waving and polemics.

    BTW, Cubist is available as an R package (as is M5, Quinlan’s original tree regression tool). I use the C command line version because it has more options.

    Here is the complete regression tree, for those who are interested. It has only 8 rules:

    Cubist [Release 2.07 GPL Edition] Wed Feb 27 08:04:24 2019
    ———————————

    Options:
    Application `labitzke’

    Target attribute `TEMP’

    Read 270 cases (10 attributes) from labitzke.data

    Model:

    Rule 1: [43 cases, mean -79.0, range -84 to -76, est err 1.5]

    if
    C = True
    MONTH in {dec, jan}
    then
    TEMP = -79

    Rule 2: [37 cases, mean -73.5, range -83 to -70, est err 2.6]

    if
    C = True
    MONTH in {nov, feb}
    then
    TEMP = -73

    Rule 3: [82 cases, mean -68.0, range -77 to -49, est err 3.8]

    if
    C = False
    STAR = False
    MONTH in {nov, dec, jan, feb}
    then
    TEMP = -68

    Rule 4: [16 cases, mean -65.8, range -79 to -61, est err 3.5]

    if
    C = True
    MONTH = mar
    then
    TEMP = -65

    Rule 5: [18 cases, mean -55.3, range -69 to -38, est err 5.8]

    if
    STAR = True
    MONTH in {nov, dec, jan, feb}
    then
    TEMP = -56

    Rule 6: [20 cases, mean -55.0, range -61 to -45, est err 3.2]

    if
    C = False
    FW = False
    MONTH = mar
    then
    TEMP = -56

    Rule 7: [13 cases, mean -53.7, range -57 to -51, est err 1.9]

    if
    C = True
    MONTH = apr
    then
    TEMP = -53

    Rule 8: [9 cases, mean -48.0, range -53 to -42, est err 3.9]

    if
    FW = True
    MONTH = mar
    then
    TEMP = -48

    Rule 9: [32 cases, mean -44.8, range -50 to -34, est err 2.8]

    if
    C = False
    MONTH = apr
    then
    TEMP = -45

    Evaluation on training data (270 cases):

    Average |error| 2.9
    Relative |error| 0.29
    Correlation coefficient 0.94

    Attribute usage:
    Conds Model

    100% MONTH
    90% C
    37% STAR
    11% FW

    Time: 0.0 secs

    • Hmm, I used the <pre> and </pre> tags to format the code and tables, but the comment system still sucked up the indenting, so check the Python indentation before running. [Left as an exercise for the students]
      😐

  39. “ … complete regression tree, for those who are interested. It has only 9 rules, not 8 … ”

    For more info on Cubist and related data-mining tools from Ross Quinlan (an early pioneer in this area):
    https://www.rulequest.com/cubist-info.html
    https://www.rulequest.com/Personal/

    A cogent survey of recursive partitioning classification and regression tree builders, from a distinguished stat.wisc.edu statistician (Loh) who dislikes these tools in general, but is amazed how many researchers still find them useful.
    http://washstat.org/presentations/20150604/loh_slides.pdf

  40. Willis,
    I would be interested in hearing your views on my Cubist experiment, which indicates that there is no support for the solar cycle/QBO hypothesis in the Labitzke datafile you used for your scatter plot.

    I’ve always been skeptical of the solar-cycle connection to terrestrial weather. But this file looked interesting because the temperature “measurements” [actually ‘reanalysis’ forecasts] were made in the mid-stratosphere, where warming due to enhanced UV scattering could conceivably be a little stronger. Even more tantalizing, your scatter plot does show some kind of mild correlation (0.28) between 30hPa temp and sunspot count.

    Actually, I was able to strengthen that correlation, using Cubist, to 0.94, but also demonstrate, in my regression, that the correlation came from the layer attributes in the dataset, not from sunspot counts or QBO.

    I would also like to find the data for the other layers you examined. They don’t appear to be in the other data link you provided.

    Thanks,
    Johanus

  41. Well it took a while to read through the comments to date, but I am pleased to report that I have reached a five star, fur lined, diamond studded, gold plated, ocean going conclusion re the great solar/ non solar debate: the science is not settled.

    • “the science is not settled.”

      Which is good. When science becomes settled, I believe it starts to turn into religion, making it much harder to falsify.

  42. I am using Cubist to further study the structure of the Arctic winter temperature problem. I excluded the attributes RJ, C, CW, FW, and STAR, leaving only MONTH and QBO. Obviously MONTH is important for explaining temperatures, and I am curious what predictive power QBO has in this ‘northpole’ Labitzke dataset.

    Not surprisingly, the mean error doubled and the correlation fell to 0.77, but that is still a respectable value. Only 3 rules were generated, which neatly divided this winter regime into 3 segments: {nov,dec,jan,feb}, {mar} and {apr}.

    There is a standard default for creating estimators: if you don’t have the time or skill to write an accurate estimator, then just use the mean value of the data as the first approximation. And that is exactly what Cubist did here, assigning the mean value of each segment as the regression.

    So, even though the temperature varies from -84 to -34, this very simple “bottom line” estimator has a respectable average error of only 5.7 degrees.

    Read 270 cases (10 attributes) from labitzke.data
    Attributes excluded: RJ, C, CW, FW, STAR

    Model:
    Rule 1: [180 cases, mean -70.5, range -84 to -38, est err 6.1]
        if MONTH in {nov, dec, jan, feb}
        then TEMP = -71
    Rule 2: [45 cases, mean -57.4, range -79 to -42, est err 6.6]
        if MONTH = mar
        then TEMP = -57
    Rule 3: [45 cases, mean -47.4, range -57 to -34, est err 4.3]
        if MONTH = apr
        then TEMP = -47

    Evaluation on training data (270 cases):
        Average |error|          5.7
        Relative |error|         0.57
        Correlation coefficient  0.77
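
    The “use the mean of each segment” fallback described above is easy to reproduce by hand. Here is a minimal Python sketch of that baseline estimator; the temperature records are hypothetical stand-ins, since the actual contents of labitzke.data are not reproduced in this comment:

    ```python
    # Sketch of the mean-per-segment baseline estimator that Cubist fell
    # back to in the three rules above: each rule's prediction is simply
    # the mean temperature of the training cases in its month segment.
    from statistics import mean

    def fit_segment_means(records, segments):
        """Map each segment (a set of months) to the mean temperature
        of the training cases that fall inside it."""
        model = {}
        for seg in segments:
            temps = [t for (month, t) in records if month in seg]
            model[frozenset(seg)] = mean(temps)
        return model

    def predict(model, month):
        for seg, m in model.items():
            if month in seg:
                return m
        raise ValueError(f"no rule covers month {month!r}")

    # Hypothetical training cases: (month, 30 hPa temperature in deg C)
    records = [("nov", -72), ("dec", -74), ("jan", -68), ("feb", -70),
               ("mar", -58), ("mar", -56), ("apr", -48), ("apr", -46)]
    segments = [{"nov", "dec", "jan", "feb"}, {"mar"}, {"apr"}]

    model = fit_segment_means(records, segments)
    errors = [abs(t - predict(model, m)) for (m, t) in records]
    print(f"average |error| = {mean(errors):.2f}")
    ```

    With the real 270-case file, the same procedure would reproduce the rule means (-71, -57, -47) and the 5.7-degree average error quoted above.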

    Why does this study limit itself to the winter months? Because of ‘data dredging’ or cherry picking? No, I don’t think so. I think the answer involves the so-called stratospheric polar vortex, which only exists during these months, unlike the tropospheric PV, which is completely separated from it vertically and exists all year long.

    During the winter in the NH, the Sun heats the stratosphere, but not the ground, creating a powerful temperature gradient and the stratospheric PV.
    Waugh et al., “WHAT IS THE POLAR VORTEX AND HOW DOES IT INFLUENCE WEATHER?”,
    https://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-15-00212.1

    Javier’s concept of the 30hPa geopotential height acting like a thermometer bulb is a key here:

    30 hPa geopotential height for high solar activity years and low solar activity years. The difference is a whopping 240 meters. A quarter of a kilometer. Indicating profound differences in the density and temperature of the air column below 30 hPa all the way to the surface.

    But it doesn’t heat up all the way to the surface; only the top part heats up, because the ground is in relative darkness at this time.

    I think that’s why this solar activity only involves the stratosphere. But the effects on the stratospheric PV may very well extend down to the surface.

    I think the attributes C,CW,STAR, and FW will explain a lot more about this mysterious problem.

  43. Willis, lying in bed you won’t find the sun, the north pole, or the equator.

    Be assured the sun is always out there doing its work, whatever that may be.

    • Boy, that Johann, he sure used scientific facts and observations to put me in MY place …

      w.

  44. There is another great sunspot-to-climate correlation done by Scafetta et al. at MIT. They, too, received enormous criticism from the alarmist community, even though all they did was show data and mathematics. But hey, data is data; what can you do?

Comments are closed.