Spot the trend: $100,000 USD prize to show climate & temperature data is not random

Example of eight random walks in one dimension starting at 0. The plot shows the current position on the line (vertical axis) versus the time steps (horizontal axis). Image: Wikimedia

Ross McKitrick writes via email:

A UK-based math buff and former investment analyst named Douglas Keenan has posted an intriguing comment on the internet. He takes the view that global temperature series are dominated by randomness and contain no trend, and that existing analyses supposedly showing a significant trend are wrong. He states:

There have been many claims of observational evidence for global-warming alarmism. I have argued that all such claims rely on invalid statistical analyses. Some people, though, have asserted that the analyses are valid. Those people assert, in particular, that they can determine, via statistical analysis, whether global temperatures are increasing more than would be reasonably expected by random natural variation. Those people do not present any counter to my argument, but they make their assertions anyway.

In response to that, I am sponsoring a contest: the prize is $100 000. In essence, the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.

 

He would like such people to substantiate their claim to be able to identify trends. To this end he has posted a file of 1000 time series, some with trends and some without. And…

A prize of $100 000 (one hundred thousand U.S. dollars) will be awarded to the first person, or group of people, who correctly identifies at least 900 series: i.e. which series were generated by a trendless process and which were generated by a trending process.

You have until 30 November 2016 or until someone wins the contest. Each entry costs $10; this is being done to inhibit non-serious entries.

Good luck!

Details here: http://www.informath.org/Contest1000.htm

270 thoughts on “Spot the trend: $100,000 USD prize to show climate & temperature data is not random”

      • For baseball the standard deviation of luck is about 0.039 for 162 games, the actual SD for luck is about 0.058.

        The second “luck” should be “talent”:

        For baseball, p = .500 (since the average team must be .500), and g = 162. So the SD of luck works out to about 0.039 (6.36 games per season).
        So SD(performance) = 0.070, and SD(luck) = 0.039. Square those numbers to get var(performance) and var(luck). Then, if luck is independent of talent, we get
        var(performance) = var(talent) + var(luck)
        That means var(talent) equals 0.058 squared, so SD(talent) = 0.058.
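The arithmetic above can be checked with a short sketch. It assumes the luck term is the usual binomial standard deviation sqrt(p(1-p)/g), which matches the p and g quoted in the comment; the function names `sd_luck` and `sd_talent` are made up for illustration.

```python
import math

def sd_luck(p: float, games: int) -> float:
    """Binomial SD of a winning percentage over `games` games (assumed formula)."""
    return math.sqrt(p * (1 - p) / games)

def sd_talent(sd_performance: float, sd_luck_value: float) -> float:
    """Solve var(performance) = var(talent) + var(luck) for SD(talent).

    Only valid if luck is independent of talent, as the comment assumes.
    """
    return math.sqrt(sd_performance ** 2 - sd_luck_value ** 2)

luck = sd_luck(0.5, 162)         # p = .500, g = 162
talent = sd_talent(0.070, luck)  # SD(performance) = 0.070 from the comment

print(round(luck, 3))        # SD of luck as a winning percentage
print(round(luck * 162, 2))  # the same SD expressed in games per season
print(round(talent, 3))      # implied SD of talent
```

Rounded to three places, this reproduces the 0.039, 6.36 games and 0.058 figures quoted above.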

    • All I care about is that lovely squiggle indicating the Blue Jays coming from last place, then shooting north of the damnyankees.
      Unfortunately, nice graphs don’t help your team move a runner over from third base with nobody out 🙁

      • I don’t see the squiggle for the Royals, who actually won the World Series !
        Does that mean that the graphs are pretty much random ?
        🙂

      • But if your team is the Yankees, they have a higher-than-average likelihood of being above .500, and that, I believe, is independent of what Ric is calling talent; teams in Chicago are less likely to win, independent of talent.

      • tomwtrevor November 18, 2015 at 6:13 pm
        But if your team is the Yankees, they have a higher-than-average likelihood of being above .500, and that, I believe, is independent of what Ric is calling talent; teams in Chicago are less likely to win, independent of talent.

        I think what you are noting is the talent of the management team in the front office. I may not like the Yankees, but they have some good noses for talent and DEEP POCKETS to pay for acquiring it. That is why they are more likely to have winning seasons…front office talent and money.

    • There is nothing in your “non random” series to suggest to me that they are actually non random. Do some research on fractal geometry.

    • Both of those sets of graphs look a bit contrived to me, which begs the question as to the algorithm that originated them.
      My cause for concern is that all of the steps either up or down are taken at pretty much the exact same slope angles. So the walk wander is dependent only on the length of each segment.
      I would think the outcome of a random walk would depend on the number of degrees of freedom. You say these are one dimensional.
      I would think that truly random moves, would be represented in some fashion by a Gaussian distribution function.
      From ancient times looking at white noise on an analog oscilloscope screen, it seems that the true RMS value of white noise is about 1/6th of what the eye perceives as the most probable guess for the peak to peak amplitude seen on the screen. In more recent times when looking at analog noise generated in analog MOS or CMOS circuits, it seems to be harder to judge what the peak to peak noise seems to be.
      Just when you think you have it pegged, a much larger spike appears out of nowhere to confound you. After you look at CMOS analog noise for a long time, you eventually conclude that it isn’t truly white noise.
      And you would be correct, it isn’t (most of the time).
      MOS transistors produce a decidedly pink noise spectrum, which is due to the presence of a significant 1/f noise component.
      1/f noise seems to have no lower bound to the spectrum, and the amplitude of noise spikes continues to grow at longer and longer time intervals.
      Well I believe that the bottom end of the 1/f noise spectrum was actually the big bang itself.
      1/f noise in MOS transistors is a fairly well understood physical phenomenon that relates to total gate area and the finite electronic charge. It dictates the use of PMOS over NMOS and the use of transistors with very large gate areas, compared to minimum geometries as found in logic gate circuitry.
      I once designed a CMOS analog amplifier which had a 50 Hz 1/f noise corner frequency, along with a 1 MHz 3dB cutoff frequency.
      For most MOS op amps, the 1/f corner is more likely to be 10 kHz.
      I still have a partial wafer of chips, which were only 500 x 600 microns, and only had 3 pin connection pads. Signal input terminal was an on chip MOS photodiode, comprising most of the die area.
      So I’m a bit leery of your walk graphs here; I would expect to see slope changes if it was physically random.
      I seem to recall some random walk problem that involves walking across the gap between two parallel lines. Somehow a Pi or 1/ Pi comes in to it, but I’m damned if I can remember the details of that now.
      I would suspect that Ross has that particular problem framed on the mantel above his living room fire place; maybe below the shot gun.
      g

      • 1/f noise seems to have no lower bound to the spectrum, and the amplitude of noise spikes continues to grow at longer and longer time intervals.
        Also describes global temperature, which means our brief 135 year history tells us nothing about frequencies lower than 1/135…
        I have a graph somewhere that tries to match the spectrum of GISS global temperature and estimates the spectrum below f = 1/135 using extrapolation in the frequency domain. (After all, anyone who says “the temperature is rising due to recent phenomena such as CO2” is… well… extrapolating in the frequency domain).
        I then use that extrapolated spectrum to generate pink-like noise Monte Carlo runs and what do you know, the GISS temperature trend for the last 135 years is within the 95% confidence interval of the null hypothesis: i.e. the trend for the last 135 years is likely just noise, as at least 10% of the random pink-like noise runs exceed the GISS trend. This is the much adjusted GISS trend mind you…
        Still not happy with the extrapolation, so I haven’t published yet. Most of the literature tries to generate AR(1) noise with a somewhat matching spectrum, but it’s not the same thing: they are just guessing the spectrum indirectly from the time domain instead of using proper extrapolation in the frequency domain. Another case of the signal analysis folks and the stats folks not talking…
        Of course I admit I’m extrapolating here. Unfortunately the warmists do not…
        Peter
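The Monte Carlo test described above can be sketched in miniature. This is not Peter's method: it swaps in a plain Gaussian random walk for his extrapolated pink-noise spectrum, and the step size, series length default, and the observed slope passed in are all hypothetical placeholders rather than GISS values. It only shows the shape of the test: generate many trendless series, fit a straight line to each, and count how often the fitted slope is at least as large as the one observed.

```python
import random

def ols_slope(series):
    """Ordinary least-squares slope of `series` against time steps 0, 1, 2, ..."""
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den

def trendless_walk(n, step_sd, rng):
    """Gaussian random walk with zero drift (a stand-in for the noise model)."""
    y, out = 0.0, []
    for _ in range(n):
        y += rng.gauss(0.0, step_sd)
        out.append(y)
    return out

def exceedance_fraction(observed_slope, n=135, step_sd=0.1, runs=2000, seed=0):
    """Fraction of trendless runs whose |slope| is at least |observed_slope|.

    n, step_sd and runs are illustrative assumptions, not fitted to any data.
    """
    rng = random.Random(seed)
    exceed = sum(
        abs(ols_slope(trendless_walk(n, step_sd, rng))) >= abs(observed_slope)
        for _ in range(runs)
    )
    return exceed / runs

# Hypothetical observed slope, in the same units per time step as step_sd.
print(exceedance_fraction(0.007))
```

A large fraction means the observed slope is unremarkable under this particular noise model; the answer depends heavily on which noise model is chosen, which is the point the comment is making.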

      • George E Smith, on randomness of the Big Bang: “Something that only happens once cannot be rated as to randomness.”
        I am curious how you know the Big Bang happened only once.

      • “””””…..
        Michael 2
        November 19, 2015 at 7:18 am
        George E Smith, on randomness of the Big Bang: “Something that only happens once cannot be rated as to randomness.”
        I am curious how you know the Big Bang happened only once. …..”””””
        Michael 2, my short term memory has gone to pot. Sometimes, I can’t even remember Albert Einstein’s name.
        So refresh me. Just where did I say the big bang only happened once ?? But can I infer that you may have evidence of more than one ?
        g

      • I have my own concept of what ” random ” means. Well that is what it means to me; it is not necessarily a definition that others would accept.
        In my view, Any finite set of sequential finite real numbers, is random, if any subset contains no information that would enable one to deduce the value of the next number in the set (sequence), or even to determine the direction (up, down or sideways) from the last number of that sub set. That definition tolerates no limitation on the magnitude or sign of any number in the sequence; only that they be finite.
        Now that has a problem in that if the numbers can have any finite value, then there can be an astronomical number of numbers between any two no matter how nearly equal.
        Well you mathematicians know how to put that in kosher math form with epsilons and things. So if you like you can quantize the numbers and say that no number can be closer than 1 ppm to another number; or choose your own quantization.
        My point being, the next jump could be a mm or a km, in either direction, or anything in that range.
        Arguably, random noise is the highest information content data stream you can have, in that it is 100% information. Leave out any member of the set, and the remaining set gives NO information as to what the missing number was.
        No I didn’t say the information was of any use. In fact most information is of no real use.
        g >> G

      • So we have at least two wiki disciples. Dunno how I knew of Buffon’s needle before wiki or the internet even existed (50 years before.)
        Used to be you were supposed to actually remember what you learn in school. Or at least try my case.
        g

      • I take a view that ‘randomness’ does not exist in nature (anywhere between higgs-boson to the totality of the Universe). No computer can be programmed to produce true ‘randomness’ (in my early days I used noise from a mu-metal shielded resistor). Randomness is product of the human thought.

      • Buffon’s needle is a clever statistical way to calculate Pi. Statistics itself was born out of gaming or gambling starting in the 17th(?), 18th(?) Century by a handful of math wizards.
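For readers who have not met it, Buffon's needle can be sketched as a simulation. A needle of length l dropped on lines spaced t apart (with l no longer than t) crosses a line with probability 2l/(πt), so counting crossings gives an estimate of π. The function name, the defaults, and the rejection-sampling trick used to pick a direction without already knowing π are choices made for this sketch.

```python
import math
import random

def estimate_pi(throws: int, needle: float = 1.0, spacing: float = 1.0, seed=None) -> float:
    """Estimate pi by simulated needle drops; requires needle <= spacing."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(throws):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0.0, spacing / 2)
        # Uniform random direction: pick a point in the unit disc by rejection,
        # so the simulation never uses the value of pi itself.
        while True:
            x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
            r2 = x * x + y * y
            if 0.0 < r2 <= 1.0:
                break
        sin_theta = abs(y) / math.sqrt(r2)  # angle measured from the lines
        if centre <= (needle / 2) * sin_theta:
            hits += 1
    # P(cross) = 2 * needle / (pi * spacing), solved for pi.
    return 2 * needle * throws / (spacing * hits)

print(estimate_pi(100_000, seed=1))
```

Convergence is slow: the error shrinks only with the square root of the number of throws, so this is a demonstration rather than a practical way to compute π.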

      • g.e.s. said ” sequential finite real numbers, is random, if any subset contains no information that would enable one to deduce the value of the next number in the set (sequence)”
        which is the information theoretical definition of maximum information entropy.
        it can be hard to distinguish that from randomness, though- e.g., is the decimal notation for pi random?
        and anyway, is not any sequence an infinitely improbable event in the universal set?
        it happens that data compression is a way to measure randomness.
        i think, in theory, if the data samples were compressed, the ones that compressed least must have been the most random to begin with.
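The compression idea can be tried directly. A minimal sketch, assuming zlib as the compressor and `os.urandom` as a stand-in for a "random" sample; both choices, and the helper name `compression_ratio`, are illustrative only.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more structure)."""
    return len(zlib.compress(data, 9)) / len(data)

patterned = b"0123456789" * 1000  # highly regular, 10,000 bytes
noisy = os.urandom(10_000)        # operating-system entropy, 10,000 bytes

print(compression_ratio(patterned))
print(compression_ratio(noisy))
```

The regular sample shrinks to a small fraction of its size while the noisy one barely compresses at all. The caveat raised above still applies: a general-purpose compressor finds no pattern in the digits of pi either, even though a short program generates them.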

      • In the random walk, at a set time interval (t) the walker goes a set distance (d) either in one direction or the other. So the plot will either go up or down, with the same slope every time, slope being d/t or -d/t.
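That description translates almost line for line into code. A minimal sketch, assuming a fair coin decides the direction at each step; the function name and parameters are illustrative.

```python
import random

def random_walk(steps: int, d: float = 1.0, seed=None) -> list:
    """One-dimensional walk starting at 0: each step moves +d or -d."""
    rng = random.Random(seed)
    position = 0.0
    path = [position]
    for _ in range(steps):
        position += d if rng.random() < 0.5 else -d
        path.append(position)
    return path

walk = random_walk(100, d=1.0, seed=1)

# Every segment has slope +d/t or -d/t (here t = 1 time step).
slopes = {walk[i + 1] - walk[i] for i in range(len(walk) - 1)}
print(slopes)
```

This is why every segment in the plotted walks has the same steepness: only the direction is random, not the step size. Drawing the step from a Gaussian instead would give the varying slopes the earlier comment expected.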

    • Without knowing the date of the trade deadline, could you discern it from the graphs by statistically identifying inflection points?

    • I’m posting this as high on the thread as I can (I’ll post it below, too):
      HOW TO ENTER THIS CONTEST:

      Send an entry to doug dot keenan at informath.org
      He will follow up with payment instructions.
      -Ross {McKitrick}


      Hurrah! #(:))
      (I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!)

  1. The problem is that the raw data has not yet been properly cooked.
    Once cooked it becomes flexible enough to show whatever you want it to show.

  2. Simple logical proof that the temperature data is not random:
    Randomness is not a thing, it does not exist in the physical world and can not cause anything.
    http://wmbriggs.com/post/16260/
    Rather apparent randomness is a measure of lack of information about cause.
    Therefore the temperature data is not random. The data takes the values it does for specific causes, even if we don’t know what the causes are, even if we are incapable of understanding the causes.

    • ‘Randomness is not a thing, it does not exist in the physical world and can not cause anything.’
      Who knows? And how did you define ‘can cause’ and ‘exist’ and ‘random’? Frankly, there is not much that is more deeply confusing than the existence of randomness.

      • “Frankly, there is not much that is more deeply confusing than the existence of randomness.”
        You are confused because you think randomness exists. Accept that it is just an illusion caused by the lack of information and there is no reason to be confused.

      • Sounds like new age mysticism where all things are relative and there is no such thing as evil.
        I can easily think of many things that would be counted as random. That is; the outcome can not be predicted before hand.
        You’ll need to provide some evidence of your claim that there is no such thing, rather than expecting us to accept your statement that it’s true.

      • Hugs: So the location of individual helium atoms at any moment in a jar doesn’t meet your strict standards for randomness? I know gases aren’t strictly ideal but for engineers, PV=nRT works pretty well and even if they aren’t ideal and you have to correct for the volume taken up by the atoms, their location is still (close enough for any useful purpose) random. It certainly gave us the idea for the concept.

    • I have to agree on principle, but you are misunderstanding (or simply ignoring) the point.
      While temperature is deterministic, the fluctuations of temperature are so great and of near-unknown origin that our only possible choice is to add significant random elements to any model. When there are significant random elements creating random walks over relatively short time periods, it is impossible to determine with accuracy what is due to the trend and what is due to chance.
      Probably not the best explanation, and I have to agree that the phrasing is poor, but the point is important to make.

      • “When there are significant random elements”
        There are no random elements, only elements we don’t fully understand, even in the computer models.
        Computer random number generators are not random at all. They only appear random because you don’t know how they work.
        All computer random number generators require a seed value. Usually the system clock time is used so that you have a different seed every time the process runs. However, if you repeatedly seed a random number generator with the same value you will get the same sequence of numbers every single time.
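The seeding behaviour is easy to demonstrate. A minimal sketch using Python's standard `random` module as one example of a pseudo-random generator; the helper name `draw` is made up for illustration.

```python
import random

def draw(seed: int, n: int = 5) -> list:
    """Return the first n values from a generator started with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

same_a = draw(42)
same_b = draw(42)
different = draw(43)

print(same_a == same_b)     # same seed, same sequence
print(same_a == different)  # different seed, different sequence
```

The sequence is fully determined by the seed, which is exactly why the numbers are called pseudo-random.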

        • ” Computer random number generators are not random at all.”
          Historically this is true, though I recently read about what is either a true random generator in hardware or one that is random for practical purposes.

      • That’s why I said “I agree on principle”. However, in practice, these things are random, and it doesn’t really matter what seed you use. As long as you vary it up by some independent function (most functions I know of use the clock as a seed), the lack of true randomness is immaterial.

        • ” However, in practice, these things are random, and it doesn’t really matter what seed you use. As long as you vary it up by some independent function (most functions I know of use the clock as a seed), the lack of true randomness is immaterial.”
          But if you do use the same seed, the sequence of values is the same.

      • Micro, at the risk of sounding like a bad joke, if you want to have random numbers, don’t use the same seed. While it could be useful to check for errors or the effect of changes in a calculation, typically you use some outside variable to seed your data. The best option I know of is your computer’s clock, since it will change continuously and you end up never using the same seed. If I’m not mistaken, that’s the default seeding method of most random number generators these days.

    • The question is whether you can show this, not just believe it. Arguing over the precise meaning of “random” will probably enter the picture but if it *looks* random that’s good enough. How the universe itself suddenly came into existence may have been random, or not, but it cannot be shown to be one or the other.

      • Randomness implies repetition. Something that only happens once cannot be rated as to randomness.
        g
        The very first Viet Nam war draft lottery based on birth date, was declared to be non random, by some academic statistician.
        With one test instance out of 366! possible draws, he was brazen enough to declare the result non random. Jan 1, Jan 2, Jan 3, …..Dec 29, Dec 30, Dec 31 would have the exact same probability as the draw that actually occurred. The particular draw that happened will NEVER happen again, since the odds are so astronomical.
        But it’s an absolute certainty that the statistician was an idiot.
        g
        PS Substitute 366 different icon picture labels for calendar dates to see that our calendar pattern is quite irrelevant to the issue.

      • Well Benford’s Law, which would seem to be nothing more than conjecture, gets hung up on irrelevancies, such as the decimal number system, which just happens to be a label system in the case of say a calendar year.
        Replace the 366 dates with 366 picture icons, and Benford’s law disappears into a cocked hat.
        So what would Benford’s law say about using Roman numerals in our calendar ??
        And why do C and M have ANY significance to Romans, unless they already knew the decimal number system. In which case, why did they expunge zero in favor of just seven; excuse me that’s VII characters ??
        I think I said before, that all of mathematics is fiction, and like other fiction, a whole lot of it is just crap.
        g

      • Well in a calendar year, no month has more than 31 days. So no more than two days a month can start with a 3 and sometimes no days do.
        If you number the days from Jan 1 to Dec 31; as in : 1, 2, 3, 4 ……
        Precisely one day starts with the digit (1).
        Seems like Benford also, was an idiot. Well I wonder what he knows about the Pyramid inch Code for the record of world history as chronicled in the main passage inscriptions in the great Pyramid of Giza. If he is a real expert, he would know where all of the arbitrary scale changes occur, in order to fit the chicken scratchings into a record of historical events.
        g

      • I saw Benford’s law used on Clinton’s income tax returns. There are a very large number of millionaires between 1 and 1.99 million, fewer between 2 and 2.99 and fewer still in numbers higher than these. Naturally you are going to get a lot of ‘1s’ starting the number. Also true for the 6 figure brackets. Benford’s law has been used by forensic accountants to try to determine if books have been cooked. I guess one could learn how it works to confound the forensic guys! In my case, I’m in the smaller numbers side of things anyway.
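For reference, Benford's law predicts that a leading digit d appears with frequency log10(1 + 1/d), which a forensic check would compare against the observed leading digits. A minimal sketch; the powers of two used as test data are an assumption chosen because they are known to follow the law closely, not anything taken from a tax return, and the function names are illustrative.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)

def leading_digit_shares(values) -> dict:
    """Observed share of each leading digit among non-zero integer values."""
    counts = {d: 0 for d in range(1, 10)}
    total = 0
    for v in values:
        text = str(abs(int(v))).lstrip("0")
        if text:  # skip zeros, which have no leading digit
            counts[int(text[0])] += 1
            total += 1
    return {d: c / total for d, c in counts.items()}

powers_of_two = [2 ** k for k in range(1, 501)]  # illustrative data only
observed = leading_digit_shares(powers_of_two)

for d in range(1, 10):
    print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

The law applies to quantities spread over several orders of magnitude, such as incomes; labels confined to a narrow range, such as days of the month, are not expected to follow it.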

      • Quantum mechanics can beg all they want, that doesn’t make them right. Just because we are incapable of determining the cause, does not mean that there isn’t a cause. Randomness is an illusion that we impose on ourselves through lack of information.

        • ” Randomness is an illusion that we impose on ourselves through lack of information.”
          On the QM analysis of location vs momentum, it is not a lack of information leading to uncertainty; it truly is uncertain. The explanation, as best I know, is the lack of reality. Go read about the Wheeler delayed-choice experiment.

      • MattS: Just because we are incapable of determining the cause, does not mean that there isn’t a cause.
        True, but you make a stronger claim than that: There is a cause and the appearance of random variation results from incomplete knowledge.

      • A fan of the legendary Yogi Berra, I see – at least, as regards your first sentence :-). Your second, tho, nails the epistemic issue – a much simpler example than climate time series to think with being radioactive decay of a lump of uranium.

      • The key word there is perceive. It’s an illusion. Knowing that it’s an illusion doesn’t get rid of the illusion, but your ability to understand the universe will increase once you understand that it is just an illusion. As long as you believe that the apparent randomness is real, you will draw wrong conclusions from your observations.

    • What you are indulging is a confusion of category. Randomness (note the “ness”) is not an entity, it is a quality or behavior of entities. Both entities and their behavior occur in the physical world.
      A very good positive example of the presence of randomness (aside from casting dice and tossing coins), is the kinetic theory of gases, elaborated by James Clerk Maxwell and Ludwig Boltzmann. It leads directly to the gas laws we observe in practice, down to fine detail. Here’s the point: if the gas laws were not based on randomness, we would observe the discrepancy at large scale. Which we haven’t. (There are very small discrepancies at high gas densities, but these amount to violations of the assumptions constraining the randomness, where the volume for the moving molecule becomes less available, and the molecules themselves begin to exert collective forces on each other. These are handled by corrections based on the physical phenomenology.)
      Radioactive decay is another example of a random process, typified by very large numbers of events. Very close correspondence to randomness.
      Statistical mechanics is a very-well-established branch of physics. Statistical processes represent the highest state of entropy, thus typify thermodynamic equilibrium. There is a widespread misconception that information persists eternally, but the reality is that information can be destroyed completely. Among other things, this is why the dead remain dead (zombies and vampires to the contrary).
      Here’s a problem for your point of view: There is no structured process that can generate a random process. The theory of random processes predicts this. There are many “random number” generators, but they generate only pseudo-random numbers.

      • “Here’s a problem for your point of view: There is no structured process that can generate a random process. The theory of random processes predicts this. There are many “random number” generators, but they generate only pseudo-random numbers.”
        Even physical dice and coin flips are only pseudo-random. If you train yourself to flip the coin in a very consistent way, you can significantly bias the outcome.
        This is no problem for my point of view at all. My point of view is that random does not exist in the real world. Random is an illusion that results from lack of information about cause.

      • ” Randomness (note the “ness”) is not an entity, it is a quality or behavior of entities. ”
        No, it is not. Randomness is not a quality or a behavior of entities, it is a measure of the lack of information about the qualities and behaviors of the entities.

      • Dear MattS:
        1) You completely duck the example of statistical gas physics, which is a predictive theory based on the premise that the collisions of gas molecules are entirely random. It matches reality. If reality were otherwise, the theory wouldn’t match. You might want to open a book and learn about it, because your comments indicate you know nothing about it.
        2) You also duck the example of radioactivity. Given that you don’t seem to have much information on these subjects, does that make your understanding “random”? Just askin’….
        3) Coins are not symmetrical. If I gave you a coin without face or tail, and the sides could only be determined by a Geiger counter and the fact that one side was radioactive, I think you would have a hard time biasing the outcome (if you were denied use of the Geiger counter). You wouldn’t even know the outcome. (And, yes, there is a tiny probability that the coin will land on its edge, but that results from the deviation between the premises of the theory and the reality of implementation. If it is beveled to a sharp edge, this exception will be removed.)
        4) Your “point of view” evidently is based on whatever you have pulled out of your pocket, because it does not conform to the understanding of probability and randomness obtained from either statistical mechanics or information theory…from which we have obtained nuclear energy and advanced communication techniques.
        5) Pi is not a random number. It can be calculated from the Bailey-Borwein-Plouffe formula, which suggests an underlying periodicity. (It is an “irrational” number, but not like some of the exchanges among WUWT participants.) But, more importantly, it is inappropriate to consider it “random” because the concept does not apply. Pi is not a member of a population of events, entities, or observations. Single instances of anything are not “random,” because randomness is only relative to the rest of an observed population.
        It’s all very well to exchange japes. Someone once told me that debates are like tennis games: you play to the level of your opponent. The problem here (as maybe anywhere) is that the self-inflated opinionists have no humility when trying to converse with those who know the subject. That elicits a bad response, if the informed party takes umbrage at the lack of respect and returns service. I hope I have shown that (a) I don’t respect your understanding, (b) I have politely shown you strong counterexamples, and (c) I have no intention of further debate with someone who isn’t interested in learning actual information.
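For reference, the Bailey-Borwein-Plouffe formula mentioned in point 5 is the series π = Σ 16^(-k) · (4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6)), summed over k = 0, 1, 2, … A minimal sketch that simply sums the first few terms in floating point; it does not implement the digit-extraction trick the formula is best known for, and the function name is illustrative.

```python
import math

def bbp_pi(terms: int = 12) -> float:
    """Approximate pi by summing the first `terms` terms of the BBP series."""
    total = 0.0
    for k in range(terms):
        total += (1 / 16 ** k) * (
            4 / (8 * k + 1)
            - 2 / (8 * k + 4)
            - 1 / (8 * k + 5)
            - 1 / (8 * k + 6)
        )
    return total

print(bbp_pi())
print(abs(bbp_pi() - math.pi))
```

Each term adds a little more than one decimal digit, so a dozen terms already reach the limit of double precision. The sketch shows only that the digits come from a fully deterministic rule, which is the sense in which the comment says the concept of randomness does not apply to π.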

        • James: “The digits of pi are not random.”
          The sequence of the digits of pi passes all known tests for randomity, as do all other transcendental numbers such as phi and e.

      • So Michael J Dunne, how do you propose to make a perfectly symmetrical coin with only one face being radioactive..
        You know of some special isotope that comes in two identical forms but one of them is radioactive and the other isn’t ??
        Wunnerful !
        You could make the coin out of single crystal GaAs (can’t recall which orientation), but one face will have exposed Gallium atoms, and the other face will have exposed arsenic atoms.
        Not perfectly symmetrical, but close enough.

    • Therefore the temperature data is not random.

      You may argue that no sample point in the temperature series is due to random cause and I can see where that makes perfect sense. However, I think you’re missing Keenan’s point. He’s essentially saying falsify the null hypothesis, which holds that any temperature trend observed is by chance (natural variations), using a model that can reliably do so. The 1000 datasets are intended to validate the model’s skill and show that the model’s falsification of the null hypothesis is not by chance within the model itself.

      • Chance is not a cause of natural variations. Nor is it a cause of anything that happens in the model. Nothing does or can happen by chance. Chance, probability, like randomness is a measure of the lack of information, nothing more.

      • You’re still seeing the term “Chance” as describing an underlying process. Chance, as defined by statistics, does not do so. Chance refers to the probability of an interaction between an independent variable and a dependent variable. The chance (read probability) of a model not exhibiting an acceptable goodness of fit such that it can be in error in finding significance when there is none (Type I error) or not finding it when it exists (Type II error) is very real. To create a statistical word picture, what is the “chance” of you winning a million dollars on your first pull of a slot machine in Las Vegas?
        I note that your issue is with the term, “randomness.” That is also a term used in statistics but likewise, it does not refer to physical processes. It has a very different meaning and application – randomization of a study, random selection of samples, etc…

      • “You’re still seeing the term “Chance” as describing an underlying process. Chance, as defined by statistics, does not do so. Chance refers to the probability of an interaction between an independent variable and a dependent variable.”
        Exactly. And yet you are the one saying things can happen “by chance”. For anything, even a model result, to happen “by chance”, chance must be an underlying process with causative power, but it isn’t, so stop saying things happen “by chance”.

      • Read “The Commonplace Thesis.” Then you’ll understand better why the use is in context to probability of outcome and not to describe the underlying process whether it is understood or not. The mathematician, Futuyma, I think best states how we use the term:

        Scientists use chance, or randomness, to mean that when physical causes can result in any of several outcomes, we cannot predict what the outcome will be in any particular case. (Futuyma 2005: 225)

        In this case, Futuyma refers to chance, as it is commonly used in my work, to refer to single-case objective probability. My use of the term is correct in the context it is used. There is no need for me to stop using it.

      • For example, a single coin toss could in theory be modelled using physics, but in practice, minute changes in all the variables lead to an unpredictable outcome.

      • Actually, your coin toss example proves my point. Those minute changes are impossible to fully know, but they are all causative in very specific ways. The unpredictability, the apparent randomness is the result of lack of information about the precise conditions, not an inherent property of the system.

      • Matt S: AP’s first point, what if the causes are random or chaotic, is still in play. His choice of a coin flip may not have been a good one. In any case, I trust you accept that use of the concept is very practical.

    • MattS:
      At the philosophical level your post makes a lot of sense. (Einstein said: “As I have said so many times, God doesn’t play dice with the world.”)
      Contrary to that point of view, though, I wonder if you still maintain the “non-existence” of randomness at the quantum level of nature?

      • “Contrary to that point of view, though, I wonder if you still maintain the “non-existence” of randomness at the quantum level of nature?”
        Yes. Just because we don’t, and by all indications, can’t ever have enough information to predict outcomes at the quantum level, does not mean that outcomes at the quantum level don’t have specific causes.

      • MattS
        November 18, 2015 at 11:21 pm
        Yes. Just because we don’t, and by all indications, can’t ever have enough information to predict outcomes at the quantum level, does not mean that outcomes at the quantum level don’t have specific causes.

        MattS
        Interesting. Would it be correct to say, then, that you believe that physics is (always – at some level) deterministic? And, if so, doesn’t this lead to or imply that there is no such thing as “free will”?
        Not disagreeing with you, but wondering at the implications if reality (whatever that is) is truly deterministic.

      • Well Heisenberg would insist that you cannot even know the state of anything, enough to determine what it will change to next.
        So how could anything in Physics be deterministic ??
        g
        And that’s a good thing.

        • how could anything in Physics be deterministic ??

          Because the macro scale (not named after me) effect is purely (well in particular cases) statistical, and at least in the constraints of the particular system maintains determinism.

      • “Interesting. Would it be correct to say, then, that you believe that physics is (always – at some level) deterministic? And, if so, doesn’t this lead to or imply that there is no such thing as “free will”?”
        Technically yes, but at the lowest levels, it is so complex that only an omniscient being could comprehend it.
        It only implies no such thing as free will if intellect is strictly physical, which it isn’t.

    • So – randomness does not exist, it is just a measure of lack of information. Really? If randomness did exist, how could you distinguish it from lack of information? I’m not exactly saying you’re wrong, just that your assertion looks rather difficult to prove. Maybe you could try the approach that ‘existence of randomness’ is a null theory because it can never be applied, but I suspect that even that isn’t as easy as it might look (eg. error bars can represent randomness).

    • MattS wrote, “Rather apparent randomness is a measure of lack of information about cause.”
      The ratio of information we have (what we know) to information we don’t have (what we don’t know) is so small as to be indistinguishable from zero.

    • Okay, then replace “randomness” with “apparent randomness.” But the fact remains that we lack information about the cause of increases in global temperatures. If we knew the cause (or causes), we could likely predict future global temperatures. The fact that we can’t means we don’t know much about the causes. CO2 may be one of the causes, but it is certainly not the only cause or even the main cause. Until we know what all the causes of global temperature changes are, they might as well be viewed as random because you can’t tell the difference.

      • ” CO2 may be one of the causes, but it is certainly not the only cause or even the main cause. Until we know what all the causes of global temperature changes are, they might as well be viewed as random because you can’t tell the difference.”
        Sure we can. If it was CO2, it would on average cool off more at night than it warmed the prior day.

      • Louis: However, with nearly all the energy reaching the earth being from the sun, what we have is a range of temperatures around the TSI value of temperature, impeded by aerosols of varying opacity and enhanced by a period of retention (slowing down of exit from the earth). The TSI itself would be related to the variations of the sun itself. For example, we aren’t going to reach either 300C or absolute zero given the position and TSI of the sun. Now a bombardment by huge bolides could heat us up plenty, but we would eventually return the centralized range based on TSI. The idea of Thermageddon imagined by the nut fringe (Al Gore, etc.) just ain’t going to happen. I believe temperature in particular is not random. The “walk” would have to be restrained between the swing values – perhaps the glacial and interglacial values in the long run record, and smaller ones in the shorter periods under consideration.

        • and enhanced by a period of retention (slowing down of exit from the earth).

          This is from warming of the surface, not a slowdown from CO2 in the atm. Water vapor in the atm has a slight short-term delay: as it cools at night, rel humidity goes up, and at high humidity, water condensing out of the atm or changes in the optical transmittance of IR to space slow the cooling rate.
          But cooling is almost immediate as the length of day starts to drop.

    • Randomness is not a thing? Quantum mechanics is irreducibly random (via complex numbers, not nice probabilities, which is why it’s so weird), and it’s the most accurate model of reality we have. Chaitin has argued at length for what looks uncommonly like randomness in the whole numbers. His basic definition of randomness is that a number or a sequence is ‘random’ if you cannot compress it. I would dearly love to believe in “hidden variables” underlying QM but the evidence seems to be against.
      To put it another way, if we can’t tell the difference between some measurements and a data set generated by a random model (like 1/f noise), then those measurements might as well *be* random as far as we are concerned. The issue after all is not whether God knows some cause and can discern some trend but whether a trend *we* think we can see is “really” there in some practically important sense.
      I think one of the most important lessons of the twentieth century is that we are very good at seeing patterns that are not really there and then by seeking to confirm them instead of refute them we are very good at locking in false beliefs.
      That’s why (a) this is a really neat challenge (can you reliably tell a real pattern from an illusion when we *know* which is the case, because if you can’t, you have no right to claim the pattern you see in the climate data is real) and (b) I doubt very much whether this challenge will get any serious takers, because people naturally don’t *want* to refute their beliefs but to confirm them. Why would any warmist put their ideas to this test? They “KNOW” they’re right, and those of them with the knowledge to make an attempt would find it less risky to get their next USD 100,000 from some concerned government.
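The compressibility definition mentioned above can be tried out in a rough way. The sketch below uses `zlib` as a stand-in for "cannot compress it"; that is an assumption for illustration only, since a general-purpose compressor is just a practical proxy and can miss structure that a cleverer description would capture. The two test sequences are made up for the example.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)

# A strongly periodic sequence versus bytes from the operating system's entropy source.
patterned = bytes(i % 16 for i in range(100_000))
random_like = os.urandom(100_000)

print(f"patterned:   {compression_ratio(patterned):.3f}")
print(f"random-like: {compression_ratio(random_like):.3f}")
```

The periodic sequence shrinks to a small fraction of its size, while the random-looking bytes do not shrink at all. The test only shows what this particular compressor can find, which is the practical sense of "might as well be random as far as we are concerned".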

      • ” That’s why (a) this is a really neat challenge (can you reliably tell a real pattern from an illusion when we *know* which is the case, because if you can’t, you have no right to claim the pattern you see in the climate data is real) and (b) I doubt very much whether this challenge will get any serious takers, because people naturally don’t *want* to refute their beliefs but to confirm them. ”
        For a, I didn’t look at his data, but I presume it’s not daily min max data, if it was I think I could detect a trend, the problem with the published data is they’re all doing the same wrong thing.
        For b, BEST is doing something like this, but their made up data is based on the same wrong idea they use to generate the trends that are wrong, and even if they use a clean room process, they use the faithful, which is why the trends are wrong in the first place.

    • MattS: Rather apparent randomness is a measure of lack of information about cause.
      That is intrinsically untestable, and therefore should not be believed or relied upon. To test it you would have to do every experiment twice, once with perfect information about cause, and once with lack of information without cause; and the result would have to be that apparent random variation only occurred with imperfect information. What is known empirically is that random variation has occurred in every experiment, so the most likely induction is that the next experiments will have random variation.

      • ” That is intrinsically untestable, and therefore should not be believed or relied upon. To test it you would have to do every experiment twice, once with perfect information about cause, and once with lack of information without cause; and the result would have to be that apparent random variation only occurred with imperfect information. What is known empirically is that random variation has occurred in every experiment, so the most likely induction is that the next experiments will have random variation.”
        QM is the most tested theory of modern science, and has never been disproven. I would suggest you spend some time studying it. QM uncertainty is not a measurement problem.

      • “What is known empirically is that random variation has occurred in every experiment”
        No, this is not known empirically. What is known empirically is that every experiment has had variation in observations due to unknown causes. You call this random, but just because you don’t understand what causes the variation does not mean that it doesn’t have a cause or that the cause is somehow “random”.

      • Matts,
        While I appreciate your arguments (as I see them) for an ultimately causal universe, I disagree with your insistence that this renders randomness nonexistent, because the word does not actually imply non-causality, as far as I can tell;
        Merriam Webster ~ Definition of RANDOM
        : a haphazard course
        — at random
        : without definite aim, direction, rule, or method <subjects chosen at random>
        Earlier you wrote;
        "Even physical dice and coin flips are only pseudo-random. If you train yourself to flip the coin in a very consistent way, you can significantly bias the outcome."
        Then that would not be truly random, but it seems to me the very same person could flip the coin in a way that lacked such an intent aspect, and it would be random.

      • microw6500: QM is the most tested theory of modern science, and has never been disproven. I would suggest you spend some time studying it. QM uncertainty is not a measurement problem.
        That is a good comment. I was not the one who suggested that QM was a “measurement problem”. MattS suggested that apparent randomness was a result of “lack of information”. In QM randomness is treated as a given, and the attempt to distinguish “empirical” randomness (or “epistemological” randomness) from “metaphysical” randomness was abandoned as fruitless (Pauli called it “mere metaphysics”). My assertion is that the assertion by MattS is intrinsically untestable; the one thing we know with more confidence than anything else (except maybe the famous “the sun will rise tomorrow”) is that random variation will always be present. It has been observed more times than the rising of the sun. Random variation is the variation that is non-reproducible and non-predictable — it occurs everywhere, such as in the calibration of measurement instruments; and measuring the effects of the Higgs Boson.

        • . I was not the one who suggested that QM was a “measurement problem”.

          Yes, sorry, mashed post-atoes.

          MattS suggested that apparent randomness was a result of “lack of information”. In QM randomness is treated as a given, and the attempt to distinguish “empirical” randomness (or “epistemological” randomness) from “metaphysical” randomness was abandoned as fruitless (Pauli called it “mere metaphysics”). My assertion is that the assertion by MattS is intrinsically untestable; the one thing we know with more confidence than anything else (except maybe the famous “the sun will rise tomorrow”) is that random variation will always be present. It has been observed more times than the rising of the sun. Random variation is the variation that is non-reproducible and non-predictable — it occurs everywhere, such as in the calibration of measurement instruments; and measuring the effects of the Higgs Boson.

          So, MattS, I strongly disagree that QM is a lack of information problem (as mentioned in a number of posts, again read about Wheeler’s delayed choice experiment; while it is not about quantum uncertainty, without a lot of gymnastics it is hard to argue that fundamentally reality isn’t real at all).
          But I’m far more likely to agree with you about Earth’s surface temperature.
          Which does give some hope about models, granted not the models they have, but the models that might be possible. They have a digital timing analyzer for electronic systems that basically doesn’t require patterns, something logic simulators do require. But such an analyzer does require a complete understanding of the exact operation of the system under test, which they don’t have or even begin to understand.

      • MattS: variation in observations due to unknown causes.
        That is the very assertion that I claim is untestable. I did not assert that it is false, but that it is untestable. The presence of random variation (variation that is not predictable and not reproducible) is attested in every attempt to repeat previous experiments, such as confirming the calibration curves of measurement instruments. Whether it is all due to lack of information about causes is untestable.

    • Therefore the temperature data is not random. The data takes the values it does for specific causes, even if we don’t know what the causes are, even if we are incapable of understanding the causes.
      When there is an insufficient number of known variables, the process is easily modeled by a pseudo-random process where the result matches the interpolated spectrum of the observed data. Generally statisticians try to use an AR1 model to do this, though I argue there are better methods.
      The more you find out about the system, the less that works. Let me know when you can model down to 1km in the global climate models… sometime in the next 200 years or so I think.
      Until then, the random model generates a valid null hypothesis. The temperature trend isn’t exceeding what that model produces; therefore the null hypothesis, that the last 135 years of temperature trend is merely random variation, hasn’t been falsified.
      BTW this process of Monte Carlo simulation with random data whose spectrum matches the underlying signal is very common in signal analysis and statistics. Lots of papers out there. For example follow all the references for this one:
      http://journals.ametsoc.org/doi/pdf/10.1175/1520-0477%281998%29079%3C0061%3AAPGTWA%3E2.0.CO%3B2
      Peter
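The Monte Carlo idea described above can be sketched as follows: generate many trendless AR(1) surrogate series, fit a straight line to each, and see how large a slope the trendless process produces on its own. This is a minimal illustration, not the method of the linked paper. The values `phi=0.6` and `sigma=0.1` are arbitrary assumptions, not fitted to any temperature record, and a fuller version would match the observed spectrum rather than assume AR(1).

```python
import random

def ols_slope(y):
    """Least-squares slope of y against the time index 0, 1, ..., len(y)-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def ar1_series(n, phi, sigma, rng):
    """Trendless AR(1) noise: x[t] = phi * x[t-1] + e[t], with e ~ N(0, sigma^2)."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)
        out.append(x)
    return out

def null_slope_quantile(n, phi, sigma, q=0.95, runs=2000, seed=0):
    """q-quantile of |slope| across `runs` trendless surrogate series."""
    rng = random.Random(seed)
    slopes = sorted(abs(ols_slope(ar1_series(n, phi, sigma, rng))) for _ in range(runs))
    return slopes[int(q * (runs - 1))]

# Illustrative parameters only (hypothetical, not estimated from data).
threshold = null_slope_quantile(n=135, phi=0.6, sigma=0.1)
print(f"95% of trendless AR(1) runs have |slope| below {threshold:.5f} per step")
```

An observed slope would then be compared with `threshold`: only a slope larger than what the trendless surrogates routinely produce counts against the null hypothesis. The answer depends heavily on the noise model chosen, which is the point in dispute.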

        • Yes, the individual temperatures are not random and all have physical cause to vary over time and space, but the SUM of those temperatures will tend towards a random Gaussian distribution (Central Limit Theorem). Annual Global temperature is not a temperature measurement: it is a humungous average of over a million individual actual measurements. Average = SUM/n

      • the SUM of those temperatures will tend towards a random Gaussian distribution (Central Limit Theorem). Annual Global temperature is not a temperature measurement:

        No, not actually. Have you tried running a histogram or an FFT on the data?
        A Gaussian distribution has a certain spectrum of noise. At the very least, there’s strong autocorrelation across 2-3 months (extremely obvious in the data), and there are numerous other long and short term correlation mechanisms at work, both known and unknown. The global temperature record doesn’t match a Gaussian spectrum.
        Central Limit Theorem applies to INDEPENDENT measurements. In a time series, temperatures are not independent. Heck, over SPACE they are not independent. This destroys a pile of assumptions being used by Berkeley Earth, GISS, HadCrut, et al in their attempts to homogenize spotty temperature records. My wild guestimate from some back of the envelope Monte Carlo simulations is that their error bars are 2.5x bigger than they say they are based on spatial autocorrelation alone. I haven’t had a chance to look into problems with finding breaks and stitching things together (see my post elsewhere on this page); that’ll likely just add to the error bars. (I’d have to run their entire code set with a newer stitching algorithm; I don’t have the resources to do that right now.)
        The CLT is likely the most abused assumption in modern scientific literature.
        Peter
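The independence point above can be illustrated with a small simulation. It compares the spread of the mean of autocorrelated values with the 1/sqrt(n) spread that independence would give. It uses temporal AR(1) correlation with an arbitrary `phi=0.8` as a stand-in; it is not the spatial calculation described in the comment and does not reproduce the 2.5x figure. For AR(1) noise the large-n inflation factor is sqrt((1+phi)/(1-phi)).

```python
import random
import statistics

def ar1_mean(n, phi, rng):
    """Mean of n values from a unit-variance AR(1) process."""
    innov_sd = (1 - phi ** 2) ** 0.5   # keeps the marginal variance at 1
    x = rng.gauss(0.0, 1.0)            # start in the stationary distribution
    total = 0.0
    for _ in range(n):
        total += x
        x = phi * x + rng.gauss(0.0, innov_sd)
    return total / n

def sd_of_mean(n, phi, runs=1000, seed=1):
    """Standard deviation of the sample mean, estimated over `runs` simulations."""
    rng = random.Random(seed)
    return statistics.pstdev([ar1_mean(n, phi, rng) for _ in range(runs)])

n, phi = 1000, 0.8                     # phi is an illustrative assumption
naive = 1 / n ** 0.5                   # what independence would predict
observed = sd_of_mean(n, phi)
print(f"naive 1/sqrt(n): {naive:.4f}")
print(f"simulated:       {observed:.4f}  (ratio {observed / naive:.2f})")
```

Error bars computed as if the values were independent come out too narrow by roughly that ratio, which is why the independence assumption matters so much for the published uncertainties.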

        • Reading your post Peter, made me think of something that keeps nibbling around the edges, measured daily temps are a sine wave plus noise (weather), globally with an average peak to peak range of ~18F, and a period of 24 hours, you’ve lost all the useful info by the time you have a monthly average, and this daily signal is on top of a 12 month period sine wave, with a peak to peak signal of ~80 – 90F.

      • Peter
        I am the author of the bizarrely labelled Checker 22 comment. My brevity is at fault in leaving the impression that CLT is being naively applied. I do understand that it applies to independent measurements and I was also referring to annual global temperature anomaly index calculations, not monthly or shorter periods.
        I will try and restate because I am interested in understanding the problems with this approach.
        An annual global index calculation involves averaging over 1 million individual temperature measurements (2x365x(say) 2000 stations). There are many many local correlations in time and space in that population but likewise there are subsets that are effectively independent. At some stage in the build up to the annual index, we will be averaging September averages in Peking with January averages in London. At this stage we are averaging effectively independent sub-averages and could reasonably expect CLT to have an impact.
        If over a million individual measurements were actually equivalent to just (arbitrarily) 100 independent measurements, which I think is overly pessimistic, the averaging process will still dramatically improve the signal (correlated component) to noise (uncorrelated component) ratio. Those components of the signal which correlate year to year will survive the averaging process intact. Those components which are effectively independent will turn into a Gaussian mush whose standard deviation will reduce as 1/root(n), where n is the number of effectively independent measurements. It is alleged that the AGW signal is global and effectively immediate and so is the ultimate example of a component of the signal whose S/N ratio would be improved by a factor of 10 (if n were 100) (linear).
        So, in this case I believe the usual (almost trivial) starting point of time series analysis holds: that the calculated index (global annual temperature anomaly) = signal + noise, the noise being (in the limit) Gaussian and the signal being whatever it is.
        And yes I have seen FFTs and histograms and spectral analysis but of course only ever of the composite (signal + noise) and I am the first to agree that the composite is not Gaussian. But that was my point: the composite is not easily separable into its components (which is why we are writing here in the first place) for very low S/N ratios.
        Even after the massive average crunching for the annual index which will have improved the signal to noise ratio, the standard deviation of the year on year variation is still of the order of 0.5 degree whilst the real or imaginary “trends” being calculated over various periods vary around 0.01 to 0.02 degree year on year. A “trend” which is embedded in noise whose year-on-year 1 standard deviation is (say) 30 times bigger than the year-on-year trend, is not really detectable without a lot more independent inputs.
        Jonathan
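The factor-of-10 claim in the comment above follows from the usual averaging result. A short worked version, assuming each of the n effectively independent sub-averages carries the same signal s plus independent noise with standard deviation σ (symbols chosen here for illustration):

```latex
\bar{x} = s + \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i,
\qquad
\operatorname{Var}(\bar{x}) = \frac{\sigma^{2}}{n},
\qquad
\frac{s}{\sigma/\sqrt{n}} = \sqrt{n}\,\frac{s}{\sigma}
\quad\Longrightarrow\quad
n = 100 \text{ gives a tenfold gain in S/N.}
```

The gain applies only to the independent part of the noise; any noise component that is correlated across the sub-averages is carried through the average just as the signal is.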

  3. I’ve had a long standing offer of $2250 ( 1k x the ratio of Venus’s surface temperature to its orbital gray body temperature ) ( and being on my own penny ) for anyone who can provide the physical equations in SI units , and experimental demonstration of the asserted spectral , ie , electromagnetic , phenomenon . If they can do that , then we are both rich and our energy problems are over .
    Of course , HockeySchtick has shown that the temperature differential is a necessary function of the difference in gravitational energy from top to bottom of atmospheres .
    BTW : I will be presenting my 4th.CoSy linguistic environment melding Ken Iverson’s APL with Charles Moore’s Forth at the Silicon Valley FIG “Forth Day” Hangout at Stanford the Saturday . Check my http://CoSy.com for details and to download a free and open down to the x86 system if you have a mind for it .

  4. I hope the rules don’t permit the kind of “homogenization” that has been done to the data being used because they are definitely rehung on an upward trending adjustment series. Having said that, a simple formula for determining the frequency of record temperatures over a given period where the changes are random, is to assume the first year’s average temperature is a record and then calculate Ln (N) where “N” is the number of years in the record (say N=100). This is based on a random distribution of a series of numbers from one to one hundred. Starting from the first entry, you count the next higher temperature in the series and then the next and so on. It seems to be reasonable for snowfall and flood records I’ve looked into.
    Ln(100) = 4.6 (say 4, to give some freedom to using the first as a record). Ln(200) = 5.3 (say 5 increasing records). Ln(500) = ~6, Ln(1000) = ~7
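The record-counting rule above is easy to check by simulation. The sketch below counts running maxima in shuffled-equivalent random sequences and prints the average next to Ln(N). It assumes the values are independent and identically distributed, which is the "changes are random" case only. For that case the exact expectation is the harmonic number 1 + 1/2 + ... + 1/N, which sits about 0.58 above Ln(N), so Ln(N) is a slight undercount when the first year is included as a record.

```python
import math
import random

def count_records(values):
    """Number of running maxima, counting the first value as a record."""
    best, records = float("-inf"), 0
    for v in values:
        if v > best:
            best, records = v, records + 1
    return records

def mean_records(n, runs=1000, seed=2):
    """Average record count over `runs` random sequences of length n."""
    rng = random.Random(seed)
    total = sum(count_records([rng.random() for _ in range(n)]) for _ in range(runs))
    return total / runs

for n in (100, 200, 500, 1000):
    harmonic = sum(1 / k for k in range(1, n + 1))
    print(f"N={n:5d}  simulated={mean_records(n):.2f}  ln(N)={math.log(n):.2f}  H_N={harmonic:.2f}")
```

A real series showing many more records than this would be evidence against the independent, trendless assumption, though autocorrelation also changes the count.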

    • It’s like the farmer who is selling a horse for $100…this fellow buys the horse, gives the farmer the $100 and tells him that he will pick up the horse in the morning. The next day he shows up to collect the horse. The farmer tells him he can’t do that as the horse died during the night. No worries says the buyer, just give me back my $100. The farmer tells him he can’t since he has spent the money. Ok…says the buyer I will just take the dead horse. The dead horse is loaded up and off goes the buyer. Two weeks later the farmer is in town and he spots the buyer walking down the street. “What happened with the horse”, asks the farmer. “Everything went well and I made $300”, said the buyer. “What…how was that possible ?” asks the farmer. “I raffled off the horse.. sold tickets for $10 apiece”, said the buyer. “What did the winner say when he found out that he had won a dead horse”, asked the farmer. “He did not complain too much since I gave him back his $10”…

  5. It’s an intriguing idea. This type of approach was used to discredit some of Mickey Mann’s work, feeding white noise into his statistical processes and having it spit out a hockey stick.
    In this case, they produce white noise reflecting the expected amount of variability of the global temperature, using a trendless model. Then they either add or don’t add another random trend of >1 degree or <-1 degree over the 135 data points of each time series.
    If they can use statistical analysis to discern the "anthropogenic signal" in the real-world temperature series, they should be able to do the same for these 1000 model runs.
    This is yet another clear example of how "global warming science" isn't science at all. Real scientists talk to each other using math and statistics. When you ask them to demonstrate the robustness of their math and statistics, they fail.
    Maybe I'm wrong and they'll correctly identify 90% of the time series for what they are and claim the prize. My guess is they never do and most won't bother to even try because they know their statistics are bogus.
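How hard the identification task is depends almost entirely on the trendless model, which the contest does not disclose. The sketch below is a toy version under loudly stated assumptions: the noise is either white noise or a random walk with an arbitrary step size of 0.1, the added trend is a constant ±0.01 per step (1 degree per 100 steps), and the classifier is a simple threshold on the fitted slope. None of this is Keenan's actual procedure.

```python
import random

def ols_slope(y):
    """Least-squares slope of y against the time index 0, 1, ..., len(y)-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def make_series(n, trend_per_step, noise, rng):
    """noise='white' gives independent values; noise='walk' accumulates them."""
    level, out = 0.0, []
    for t in range(n):
        shock = rng.gauss(0.0, 0.1)    # step size 0.1 is an arbitrary assumption
        level = level + shock if noise == "walk" else shock
        out.append(level + trend_per_step * t)
    return out

def accuracy(noise, runs=1000, n=135, trend=0.01, seed=3):
    """Share of series correctly labelled by thresholding |slope| at trend/2."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(runs):
        added = rng.choice([-trend, 0.0, trend])
        guess_trending = abs(ols_slope(make_series(n, added, noise, rng))) > trend / 2
        correct += guess_trending == (added != 0.0)
    return correct / runs

for kind in ("white", "walk"):
    print(f"{kind:5s} noise: {accuracy(kind):.1%} of series labelled correctly")
```

With white noise of this size the slope test is nearly perfect, while with random-walk noise it falls well short of the 900-out-of-1000 bar. So whether the prize is winnable says more about which noise model was used than about the classifier.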

  6. The number of hackers trying to decode Answers.txt will probably exceed the number of people trying to solve the problem.

    • Quite reasonably so. Looks easier than trying to figure out what ‘trendless statistical models’ were used and what is meant with ‘trend averaged 1°C/century’.

      • Dear Moderator — summoning you to help out Walt D. (just above) and others who want to enter… . There is no entry form at the informath.org link.
        Thanks!
        And THANK YOU FOR ALL YOU DO FOR TRUTH IN SCIENCE!
        Janice

      • Hey, Walt! Here ya go! #(:))
        HOW TO ENTER THIS CONTEST:

        Send an entry to doug dot keenan at informath.org
        He will follow up with payment instructions.
        -Ross {McKitrick}


        Hurrah! #(:))
        (I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!!)

  7. Doug Keenan is the challenging sort of guy this science needs, but he can be a pretty tiresome room-mate, I’d imagine.
    “The file Series1000.txt contains 1000 time series. Each series has length 135 (about the same as that of the most commonly studied series of global temperatures). The series were generated via trendless statistical models fit for global temperatures. Some series then had a trend added to them. Each trend averaged 1°C/century—which is greater than the trend claimed for global temperatures. Some trends were positive; others were negative.”
    Each trend AVERAGED 1 deg/century (+ or -). A lot of those trends (90%?) could be pretty close to zero, or at least within the noise.
    Can’t see this thing going very far.

    • “A lot of those trends (90%?) could be pretty close to zero, or at least within the noise.”
      By Jove, I think he’s got it.

    • Each trend averaged 1°C/century. If a series had a trend added to it, the trend was ±1°C/century, when averaged across the series. The trend was not necessarily deterministic-constant throughout the whole series though: it might vary decade by decade, or in some other way.

      • Dear Douglas J. Keenan,

        How does one enter this contest?

        (asking on behalf of Walt D. (2:31pm, today))
        There is no entry-form, just data, apparently (on the informath.org link in article).
        Thank you for adding that info. to the post (I hope!).
        Excellent post, too, by the way!
        Janice

  8. “The series were generated via trendless statistical models fit for global temperatures. Some series then had a trend added to them. Each trend averaged 1°C/century—which is greater than the trend claimed for global temperatures. Some trends were positive; others were negative.”
    No reasonable statistician will fall into this trap.

  9. I’m always struck by the fact that Ken Rice – aka: attp, who posts such negative comments and had a lot to say about this at Bishop Hill, never seems to comment here at WUWT. As he’s from Edinburgh, one could offer a good Scottish word: ‘Frit’.

  10. Well, it’s conceptually an easy problem to solve.
    You discard all the data that has a downward slope as time increases and you keep ONLY that data that has an upward slope over time.
    Then, with the remaining data, you perform your least squares analysis – which, of course will have a correlation coefficient near one – and voila, there you have it.
    True, you have to provide a reason to discard any data you do not like and, once again, the reason is very simple: it was discarded because it contradicted the answer you wanted to obtain.
    But, you may be criticized for providing this stupid reason, and the response to this is straightforward; you would say to the “denier;”
    “You Stupid moron, the science is settled. 97% of ALL scientists agree that climate change is real and, by the way, how much are you being paid by Exxon?”
    Very simple, really.

  11. “There have been many claims of observational evidence for global-warming alarmism.”
    Well yes, any demonstration or opinion piece arguing for lesser use of fossil fuels because of risks associated with climate change is observational evidence of that.
    “I have argued that all such claims rely on invalid statistical analyses.”
    Has anyone done a statistical analysis on the occurrence of global-warming alarmism? Is there a reference somewhere?
    “Some people, though, have asserted that the analyses are valid. Those people assert, in particular, that they can determine, via statistical analysis, whether global temperatures are increasing more that would be reasonably expected by random natural variation. Those people do not present any counter to my argument, but they make their assertions anyway.”
    (Now seriously) Do the people you disagree with think that your statistical model is physically plausible?

  12. I’ve been saying for a couple of years that surface stations, when you compare the daily rising temp to the following night’s falling temp, show no rising trend in temps and have a slight falling trend, but one that is smaller than the uncertainty of the measurements.

  13. cooling the past, warming the present….using an algorithm that adjusts the past every time you enter a new set of temperatures etc….is not random

    • And here is just one of many WUWT posts (and comment threads) which backs up what Latitude said (at 2:15pm today):

      … the growing difference is strong evidence of bias in the computation of the surface record. This bias is not really surprising, given that every new version of HadCRUT and GISS has had the overall effect of cooling the past and/or warming the present! This is as unlikely as flipping a coin (at this point) ten or twelve times each, and having it come up heads every time for both products. …

      Werner Brozek here:
      http://wattsupwiththat.com/2015/08/14/problematic-adjustments-and-divergences-now-includes-june-data/
      by Werner Brozek and R. G. Brown (ed. by Just the Facts)

  14. Statisticians can prove anything they want.
    What is more important is to identify the effects, beneficial or otherwise, on us of whatever temperature changes occur, whether statistically significant or not.

  15. I sense this is a followup to Keenan’s roil with MET where he challenged MET’s AR1 model to make the point MET’s model was incapable of determining statistical significance in the temperature series. He proved to MET (and Parliament) he was right. This challenge and prize is essentially an extension of what he already demonstrated. This is not a challenge I would spend time on trying to win the prize as I am confident, once again, he is right.

  16. I have not understood what Mr. K. said. There is a thing which I know: it is getting warmer. Did you notice?

  17. Temperature is not random, just an output to an incomprehensible number of inputs. The contest proposed is not a solvable puzzle (in my lifetime at least) and Ross knows this but it is the antithesis of the null hypothesis. If you cannot prove randomness, then prove non-randomness, a brain tease if you will. Unsolvable!

  18. All the energy in the Earth surface/atmosphere system comes from the Sun (a small amount comes from the continuing very slowly cooling of the interior – and all the atoms in the Earth system contain an unknown amount of energy bound up in them), but …
    Energy In (Sun) – Energy Out Delayed (OLR and Albedo) = Energy surface/atmosphere = Temperature
    It is NOT an easy thing to calculate/measure these components (let’s say impossible) and the resulting temperature will therefore appear random. Even the Delay component has never even been guessed at by anyone. It can actually be approximated in hours.
    Climate science assumes “Energy In” is constant (give or take a small solar cycle which has no impact anyway) and “Energy Out (Albedo)” hardly changes at all except when CO2 melts ice over a long period of time, and “Energy Out (OLR)” only varies because of CO2/GHGs. They don’t even think about the “Time” dimension inherent in the equation at all. There are no real climate models; there is only a way of “thinking” about the equation.
    What would be the temperature at the surface without the Sun? 3.0K or so one could guess.
    Instead, it is 288.0K and, in Earth history since the end of the late heavy bombardment, this value has only varied between 263K and 300K. What is the main determinant of why it has only varied between these two values? (Answer: the Sun has warmed some over the eons, but it is really the Albedo component that has been the driver of a Hot Earth or a Cold Earth.) Why would it be any different on shorter timescales?

    • “What is the main determinant of why it has only varied between these two values? (Answer: the Sun has warmed some over the eons, but it is really the Albedo component that has been the driver of a Hot Earth or a Cold Earth.) Why would it be any different on shorter timescales?”
      Every night the Sun goes down, and outside the tropics the length of the day changes, and as the days shorten the temperature drops daily. On clear, low-humidity nights it drops until morning. Averaged over all stations that record for a full year, since 1940 it has cooled slightly more overnight than it warmed the prior day.
      The energy balance at these stations shows no warming trend.

  19. Has anyone found out where people can go to submit answers? I was just goofing off, and I came up with a list as a guess at an answer. I don’t know if I’d want to spend $10 given how quickly I put the list together, but even if I did, I don’t see how I’d go about doing it.

    • Yes! #(:))
      HOW TO ENTER THIS CONTEST:

      Send an entry to doug dot keenan at informath.org
      He will follow up with payment instructions.
      -Ross {McKitrick}


      Hurrah! #(:))
      (I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!!)

      • Wait, what? The rules specifically say entries must be accompanied by the entry fee. In what world does that mean people must e-mail him then wait for a response telling them how they can submit an entry fee? By definition, that means the method for entering the contest hasn’t been disclosed.

  20. This does not matter. I say this because Paul Ehrlich is a Stanford professor and environmentalist celebrity and not a retired shoe salesman.
    So, basically, being proven wrong means nothing in the scientific and scientific celebrity community. Which means that the community does not exist on a functional level, but there you go.

  21. Dear Ross,
    I would like to direct your attention to a multitude of previous campaigns like this one, and their rather poor performance. (I’ll leave it like that, unsupported by facts.)
    And be bold enough to suggest a better approach. In the time of the net, to get good attention, you need there to be something for the supporters to do: something that shows the support, and makes both it and the challenge newsworthy.
    So I suggest you make it so people can fund others’ tries. Let’s say I could fund M. Mann’s try at the $100k, and you could send him an email saying he had one free try. Copy that to some news org. You do that every time somebody gets sponsored, and for decency and newsworthiness, add that $10 to the prize money. Get one of them counters here at WUWT, and a “one click donate” for your “favorite” climate change researcher, or add another. The possibilities for spin and press are huge. (M. Mann has so far “collected” xx$ for: “The search for certainty”.)
    And who knows, it might grow on you.

    • James, you will have to do something, MUCH, actually, to:
      1) qualify your expert witness, Grant Foster, a.k.a., “Tamino;” and
      2) to rehabilitate his credibility
      if you want us to take anything he writes seriously.
      Re: Foster’s Qualification as an Expert:
      “Rahmstorf et al {Grant Foster} (2012) assume the effects of La Niñas on global surface temperatures are proportional to the effects of El Niño events. They are not. Anyone who is capable of reading a graph can see and understand this.”
      Bob Tisdale here: http://wattsupwiththat.com/2012/11/28/mythbusting-rahmstorf-and-foster/
      I’ll take Bob Tisdale’s analysis over “Tamino’s” any day.
      Your “expert’s” credibility is also questionable:
      I find it difficult to believe that something so obvious is simply overlooked by climate scientists and those who peer review papers such as Rahmstorf {and Foster} (2012). Some readers might think the authors are intentionally being misleading. ***
      The sea surface temperature records contradict the findings of Rahmstorf et al {Foster} (2012). There is no evidence of a CO2-driven anthropogenic global warming component in the satellite-era sea surface temperature records. Each time climate scientists (and statisticians) attempt to continue this myth, they lose more and more…and more…credibility.
      Bob Tisdale (Ibid.)
      And this is just ONE example. Typical, according to WUWT commenters on this thread: http://wattsupwiththat.com/2014/10/12/is-taminos-open-mind-dead-or-just-in-an-extended-pause/

      • So it’s fair to say that you believe the best model for climate is one which allows for unmitigated, unpredictable swings to unbounded extremes? You must be a huge proponent of extreme adaptation and mitigation spending.

    • Science does not work by proof. It works by falsification, which lends evidence to support conclusions. To suggest that “proving climate is not a random walk is not [unsolvable]” is philosophy.

      • Science does not, mathematics does. The proposed problem is purely mathematical with no physical basis. QED.

      • There are some who disagree with this premise – that science can’t prove anything. Some would be wrong. Science is based on experiments and observations which can arrive at flawed conclusions that are well accepted until some time in the future when the flaw is discovered.
        The basic test of any scientific theory is whether or not it can be falsified. If there is no proposed method to falsify a theory then it is not truly a scientific theory. Experiments and their measurements can only disprove a theory or be consistent with it.
        Karl Popper demonstrates quite clearly in his book, “The Logic of Scientific Discovery,” why science can’t prove anything. Give it a read and I trust it will make sense.

    • Yes, the proposed puzzle is unsolvable, but proving that climate is not a random walk is not. https://tamino.wordpress.com/2010/03/11/not-a-random-walk/
      More “stats folks need to talk to signal processing” folks problems in this writeup. The stats folks are making assumptions about what the noise looks like at frequencies smaller than 1/135 (or more accurately about 4/135 due to Nyquist).
      Since the data simply doesn’t exist for > 135 years they are extrapolating. That’s also called guessing, and in this case not even educated guessing. In fact it’s bad guessing, there’s lots of papers showing multi-hundred and thousand year oscillations.
      Also the author ignores the possibility that the “bounded” temperature may have bounds far outside the last 135 years. In fact the proxy records seems to indicate as much.
      Also I find the AR model to be just a lame way of doing interpolation in the frequency domain, when, in fact, you could just translate the original signal to the frequency domain and do fairly basic, statistically sound interpolation and smoothing (albeit on complex numbers), and then apply that result to white noise to shape it. Haven’t found a good paper on that, I’m a little bit in wild territory on this idea…
      I’m going to look at the challenge data in the frequency domain. This should be fun, the author of the above challenge is a statistician and probably doesn’t talk to signal processing folks. I might catch something and make some cash. You never know till you try…
      Peter

      • How sad for my dreams of riches. My attempt to recreate Doug’s process showed a clear difference in a complex plot of an FFT, but Doug Keenan’s did not :-(. I got quite excited for a few hours.
        My pinknoise generator is not generating the same phase relationships as Doug’s… Doug’s are highly regular and uniform and look exactly like that of a ramp we are looking for…
        Back to the drawing board.
        Peter

  22. the prize is $100 000. In essence, the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.
    Demonstrations, like proofs, depend on assumptions: make the appropriate assumptions, and the results follow. Anybody can experiment a lot, and then submit the demonstration, out of the many, that satisfies the requirement. He must have some restrictions on what assumptions are acceptable. Otherwise lots of people will win the prize. Note the wording: the increase in global temperatures is probably not due to random natural variation; a conclusion like that depends on the prior probability that the increase is not due to chance. What prior distributions on that outcome are permitted?

    • While the series were generated randomly, were they filtered in a non-random fashion? For example, based on their trend? This would effectively stack the deck if 3 different filters were used to match the 3 different sets of results.

    • The challenge is to use their statistical prowess to separate out the inherent noise of the climate system from the “anthropogenic signature” that they claim has contributed less than 1 degree of warming over the last 100+ years.
      The details of the challenge are described at the website. He used a Global Temperature model to generate 1000 random time series that have no trend for 100+ years, then added a trend to a subset of them that averages out to >+1 degree or <-1 degree over the length of the time series.
      If they can correctly identify at least 900 of those time series as having a trend introduced or not, they win.

      • generated 1000 random time series that have no trend for 100+ years, then added a trend to a subset of them that averages out to >+1 degree or <-1 degree over the length of the time series.

        You have not plotted the data, have you… also AR models DO produce trends. That’s the whole point of the exercise. There are naturally occurring trends. How do you know if we have a natural trend or an aCO2-caused trend? A CAGWer can get $100k if they know how to do this. If they don’t, they have no business blaming aCO2…
        I’ll also note in passing this also applies to ANYONE correlating to temperature. For example the Solar folks… most people have a real hard time with WE DON’T KNOW.
        Peter

  23. http://www.informath.org/AR5stat.pdf
    Wow, this paper is basically a summation of what I spent the last 6 months futzing around with and have posted brief glimpses. Thanks for publishing this article Anthony, good stuff. I might be inspired to attempt to finish the work I describe above on what a confidence interval for a trend of ~1/f noise should look like.
    Unfortunately the paper is mostly in English and not technically detailed, so replicating it will be hard. The author doesn’t propose a correct model, only points out the one in IPCC AR4/AR5 is wrong.
    Peter

    • While Keenan does not state his preferred model in the paper (although he alludes to it on page 5), he has presented it in his testimony before the UK Parliament in Lord Donoughue’s inquiry of MET’s model and its ability to show statistical significance in the temperature series.
      Keenan’s preferred model in this case is a driftless ARIMA(3,1,0). Met and the IPCC use an ARIMA(1,0,0) model that has been shown to drift as it lacks an integrative term and is too narrowly scoped in its autoregressive term.

      • Doug, my apology for poor wording. I recalled that you offered ARIMA(3,1,0) as a favored model against MET’s trending model, as stated in your guest post at Bishop Hill:

        The op-ed piece includes a technical supplement, which describes one other statistical model in particular: a driftless ARIMA(3,1,0) model (again, unfamiliarity with the model does not matter here). The supplement demonstrates that the likelihood of the driftless model is about 1000 times that of the trending autoregressive model. Thus the model used by HM Government should be rejected, in favor of the driftless model.

        That is the case I made reference to. I realize that what model is chosen for a given time series is dependent upon testing with AICc, BIC, ACF, PACF lag correlations and more. Thus, no, you would not wholesale advocate using any particular model and for good reason. I would not intentionally make the mistake in saying you would.

      • Hank, I appreciate that your comment was an honest one. The quote you include says that a driftless ARIMA(3,1,0) model is far better than the IPCC model, and so we should reject the IPCC model (which was the conclusion that I was seeking). The quote does not say that the ARIMA(3,1,0) model is any good in an absolute sense. I really have no idea what model we should accept. For a long discussion of all this, see my critique of the IPCC statistical analysis (linked on the contest web page, in the section on HMG).

  24. Information and entropy. The more entropy the less information. So yes randomness is lack of information. Any recent (post ’60s – maybe earlier) study of thermo would give the equations.

  25. About submitting a contest entry, this can be done by sending me an e-mail. The contest web page has been revised to state that.
    @egFinn   That is an interesting idea, and might well be worth pursuing. I want to think about it some more. Kind thanks for suggesting it.

    • Any finite time series, any finite shape can be produced by a random process. For this reason one cannot distinguish between random and not random sequence of numbers. I think I do not understand the premise of the contest.

      • Any finite time series, any finite shape can be produced by a random process. For this reason one cannot distinguish between random and not random sequence of numbers. I think I do not understand the premise of the contest.

        I finally get to reply to the other Peter…
        In statistics of a time series and signal processing, when the underlying causes of a signal are unknowable, the null hypothesis against “do I have a correlatable or otherwise useful signal” is to check to see if the signal is significantly different from random noise of the same spectrum, because when you have underlying causes from large numbers of random variables, it’s in effect a random process.
        The null hypothesis for “is there a significant signal in the global temperature record” is to test against a Monte Carlo simulation of a random process of equivalent spectrum in order to determine if it’s just due to variation in the variables you don’t know about. Some example of “known unknowns” are solar variation, stored heat oscillations in the oceans, oscillations in ice coverage between the arctic and antarctic, PDO, etc, all stuff that’s been speculated about for years but aren’t measurable due to all the confounding other variables or lack of data. You can only make a valid conclusion about the entire group of unknown variables (aka the global temperature), the individual components aren’t distinguishable. Then of course there’s all the unknown unknowns…
        Here’s a paper that uses this technique to find that indeed, the ENSO signal is significantly different from random noise. Also note there’s no other signal there…
        http://journals.ametsoc.org/doi/pdf/10.1175/1520-0477%281998%29079%3C0061%3AAPGTWA%3E2.0.CO%3B2
        You can follow the references in that paper if you like to see who originated this general idea and prove its validity. Alas I keep getting stuck behind paywalls and I gave up trying to find the original paper. Such a tragedy that general human knowledge that taxpayers paid for is not available to the common man. I’m also sad the author of this contest didn’t cite the original paper in his letter to his government.
        Compounding this problem is that the lower frequency the signal compared to your sample, the more likely it’s noise. You are also extrapolating, because you don’t know what the noise is at frequencies lower than the inverse of ~1/5 the length of your sample (Nyquist). There’s no lower frequency than the trend drawn through your data…. That’s why I laugh every time I see a trend line drawn through temperature data. Including Lord Monckton’s 18yr9mo “no trend” graphs (though he’s just hoisting the warmists by their own faulty petard).
        Peter
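
The Monte Carlo null test described above can be sketched in a few lines of Python. This is only a rough illustration, not anything from the contest or the linked paper: it stands in an AR(1) process for "a random process of equivalent spectrum", and the names `ols_slope`, `ar1_series`, `monte_carlo_p_value`, and the values of `phi` and `sigma` are all assumptions picked for the example. In practice the noise model would have to be estimated from the data, which is exactly the contested step.

```python
import random


def ols_slope(y):
    """Least-squares slope of y against the time index 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx


def ar1_series(n, phi, sigma, rng):
    """One trendless AR(1) realisation, started from its stationary distribution."""
    x = rng.gauss(0, sigma / (1 - phi ** 2) ** 0.5)
    out = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0, sigma)
        out.append(x)
    return out


def monte_carlo_p_value(observed, phi, sigma, n_sims=2000, seed=0):
    """Fraction of simulated noise series whose |slope| is at least the observed |slope|."""
    rng = random.Random(seed)
    obs = abs(ols_slope(observed))
    exceed = sum(
        abs(ols_slope(ar1_series(len(observed), phi, sigma, rng))) >= obs
        for _ in range(n_sims)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_sims + 1)


# Toy input: a steep made-up ramp plus a little white noise, 135 points long.
demo_rng = random.Random(1)
ramp = [0.05 * t + demo_rng.gauss(0, 0.1) for t in range(135)]
p_ramp = monte_carlo_p_value(ramp, phi=0.9, sigma=0.1, n_sims=500)
```

A small `p_ramp` only says the ramp is unlikely under this particular assumed noise model; pick a different null (say, one with more low-frequency power) and the answer can change.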

      • Sorry, I forgot to relate this to what the challenger did.
        The challenger threw down the gauntlet and said “let’s see if you can invalidate the null hypothesis that the 135 years of temperature is just noise”, at the 90% confidence interval (100 out of 1000 have real trends). If you can answer his challenge, then you can also prove that the temperature record has a statistically significant trend. (let’s not argue whether the trend is drylabbing at this point, it doesn’t matter for this particular statistical argument.).
        Personally I accept the challenge not because I don’t agree with the challenger’s hypothesis, but rather he’s a statistician and I’m a signal processor, and I might be able to find a flaw in his generation of random signals ;-). I could use $100k to fund that around the world surf trip I always wanted to take.
        Peter

    • And, one more time (well, Mr. Keenan — I didn’t read your 11/19, 0419 post until a few seconds ago and… I went to a lot of trouble, so… I’m getting my posting money’s worth, heh) — just for the convenience of WUWT commenters:
      HOW TO ENTER THIS CONTEST:

      Send an entry to doug dot keenan at informath.org
      He will follow up with payment instructions.
      -Ross {McKitrick}


      Hurrah! #(:))
      (I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!!)
      Addendum: See Douglas Keenan’s post at 0419 today — he put entry info. on the contest site.

  26. In Excel with years 1880; 1881 … etc. down column A and a seed number* in Cell B1, this formula:
    =B1+(RAND()-0.5)*0.15
    entered in Cell B2 copied on down and graphed out, generates a curve that often looks darn similar to GISS or any other world temperature timeline graph.
    *The anomaly value from the base for the first year.
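
For anyone who would rather try this outside Excel, here is a minimal Python sketch of the same recipe. The function name `random_walk`, the starting anomaly of -0.2 and the seed are made up for the example; only the step rule `(RAND()-0.5)*0.15` comes from the formula above.

```python
import random


def random_walk(start, n_steps, step_width=0.15, seed=None):
    """Mimic =B1+(RAND()-0.5)*0.15 copied down a column."""
    rng = random.Random(seed)
    values = [start]
    for _ in range(n_steps):
        # rng.random() is uniform on [0, 1), like RAND(); centring and scaling
        # gives a step between -step_width/2 and +step_width/2.
        values.append(values[-1] + (rng.random() - 0.5) * step_width)
    return values


# 135 values, one per year for 1880..2014; the start value is a hypothetical anomaly.
series = random_walk(start=-0.2, n_steps=134, seed=1)
```

Change the seed and plot `series` against the years to see how often a trendless walk wanders off in one direction for decades.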

  27. @Keenan
    I wonder if you have a way to find the trended/non trended series without knowledge of the process.
    And would it be possible even with knowledge of the process?
    I like the idea of the contest, it is funny and interesting for the coupling to the real world.

    • Dear Mr. Ferdinandsen,
      Try e mailing Douglas Keenan here: doug.keenan@informath.org.
      Best wishes for a successful contest entry! (oh, boy, do I admire all of you who can even make a reasonable attempt to do that! — I’d have to take down about 5 books and study for months….)
      Janice

    • Svend Ferdinandsen November 19, 2015 at 7:11 am
      @Keenan
      I wonder if you have a way to find the trended/non trended series without knowledge of the process.
      And would it be possible even with knowledge of the process?
      I like the idea of the contest, it is funny and interesting for the coupling to the real world.

      I have occasionally, over the past several years, wondered about what may be similar to the same thing that Svend Ferdinandsen seems (to me) to suggest. Sorry in advance Svend Ferdinandsen if I misinterpreted your comment.
      Can one determine if some sets of numbers are time series at all if one is only given the sets of numbers without any reference to what/how the numbers are derived or measured from or whether they are time series? I guess that depends on a basic question, do time series have features that uniquely and unambiguously distinguish them as time series without knowing they are time series?
      The question interests me in a ‘detective’ kind of perspective.
      John

  28. And according to the Essex et al paper, average temperatures – because they are numbers, they can be averaged – can have no physical meaning.

  29. If I were to give a number to a thousand places such as 2.03048154248…. and which never repeated itself and there was never any clue what the next number would be, would that then be a random number? It would appear that way. However if I then told you that this number is merely pi with all numbers decreased by 1, would it still be a random number?
    (pi = 3.14159265359…)

  30. Doug (Keenan), first, thanks for an interesting challenge. I agree with you that the generally used statistics are wildly deceptive when applied to observational data such as global temperature averages. However, it is not clear what you mean when you say that the datasets were generated via “trendless statistical models”.
    From an examination of the data, it appears that you are using “trendless” to mean a statistical model which generates data which may contain what might be called a trend, but the trends average to zero. Your data is strongly autocorrelated (as is much observational data). Random “red noise” data of this type contains much larger trends, in both length and amplitude, than does “white noise” data.
    Using a single-depth ARMA model, your average AR is 0.93, which is quite high. Data generated using high-AR
    But then, without a “bright-line” definition of whatever it is that you are calling a trend, it’s hard to tell.
    Next, “trend” in general is taken as meaning the slope of the linear regression line of the data. Viewed from this perspective, it is clear that ANY dataset can be decomposed into a linear trend component and a “detrended” residual component.
    Since from this perspective all datasets contain a trend, it’s obvious that your challenge (determine which of these contain an added trend) is very possibly not solvable as it stands. Fascinating nonetheless.
    You may have given your definition of a “trend” and a “trendless statistical model” elsewhere, in which case a link (and a page number if necessary) would be greatly appreciated.
    A final question—were the trends added to random datasets, or to chosen datasets?
    Thanks for the fun,
    w.
    PS—using only 135 data points and asking for 90% accuracy? Most datasets are monthly, so they are on the order of 1500 data points. Next, we don’t need 90% accuracy. We just need to be a measurable amount better than random, whatever that might be. A more interesting test would be 1000 datasets from the SAME “trendless statistical model” with a length of 1500 data points or so, with half of the thousand having a trend added. That would let us compare different methods of investigating the problem.

    • Willis,
      Taking your points in turn….
      A trendless statistical model is a statistical model that does not incorporate a trend.
      AR coefficients should only be calculated after detrending, otherwise they will tend to be too high. Calculating the AR(1) coefficient using your method on HadCRUT spanning 1880–2014 (135 years) gives 0.92, which is essentially the same as what you got for the simulated data.
      There is a standard definition of “trend” in time series analysis. In particular, trends can be stochastic, as well as deterministic. The standard reference is
      Time Series Analysis by Hamilton (1994); for trends, see chapter 15.
      Regarding your claim that “all datasets contain a trend”, that is not true. Rather, no statistical data set contains a trend. Instead, statistical models of a data set may or may not contain a trend.
      About your final question, the contest web page should have said that the series were randomly-chosen. I have revised the page. Kind thanks for pointing this out.
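
The point about detrending before estimating an AR(1) coefficient can be checked on a toy series. The sketch below is an illustration under made-up assumptions (a deterministic slope of 0.01 per step, white noise with standard deviation 0.1, helper names `lag1_autocorr` and `detrend`); it says nothing about HadCRUT itself, and it uses the lag-1 sample autocorrelation as a simple stand-in for a fitted AR(1) coefficient.

```python
import random


def lag1_autocorr(y):
    """Lag-1 sample autocorrelation, a simple estimate of an AR(1) coefficient."""
    n = len(y)
    m = sum(y) / n
    num = sum((y[i] - m) * (y[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in y)
    return num / den


def detrend(y):
    """Subtract the least-squares straight line from y."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    slope = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y)) / sxx
    return [v - (y_mean + slope * (i - x_mean)) for i, v in enumerate(y)]


rng = random.Random(42)
noise = [rng.gauss(0, 0.1) for _ in range(135)]          # uncorrelated noise
trended = [0.01 * i + e for i, e in enumerate(noise)]   # same noise plus a straight line

raw_estimate = lag1_autocorr(trended)                    # inflated by the trend
detrended_estimate = lag1_autocorr(detrend(trended))     # close to zero
```

Here the noise has no memory at all, yet the raw estimate comes out high simply because the straight line dominates the variance. Whether a straight line is the right thing to remove from real data is, of course, the question being argued in this thread.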

      • Douglas J. Keenan November 19, 2015 at 3:21 pm

        Willis,
        Taking your points in turn….
        A trendless statistical model is a statistical model that does not incorporate a trend.

        Can a trendless statistical model produce trended data? Depends on your definition of “trend”.

        AR coefficients should only be calculated after detrending, otherwise they will tend to be too high. Calculating the AR(1) coefficient using your method on HadCRUT spanning 1880–2014 (135 years) gives 0.92, which is essentially the same as what you got for the simulated data.

        OK.

        There is a standard definition of “trend” in time series analysis. In particular, trends can be stochastic, as well as deterministic. The standard reference is
        Time Series Analysis by Hamilton (1994); for trends, see chapter 15.

        I’m afraid that doesn’t help, as I don’t have the text you refer to. Perhaps you could define for us, in an unambiguous way, exactly what you mean by a trend, because in this case, I’m more interested in what YOU call a trend. This is particularly true since below you say that no dataset has a trend … if so, then what are you calling a trend? For example, you say:

        Some series then had a trend added to them.

        But you also say:

        Regarding your claim that “all datasets contain a trend”, that is not true. Rather, no statistical data set contains a trend. Instead, statistical models of a data set may or may not contain a trend.

        If that is the case then how can you possibly add a trend to a series, as in your quote above?

        About your final question, the contest web page should have said that the series were randomly-chosen. I have revised the page. Kind thanks for pointing this out.

        Thanks, I’d assumed that, just wanted to check.
        I don’t think I can identify the trended data with 90% accuracy. However, to my surprise it is possible to distinguish between trended and untrended data at least part of the time. You giving any prizes for say 60% accuracy?
        Many thanks,
        w.

    • Random “red noise” data of this type contains much larger trends, in both length and amplitude, than does “white noise” data.

      And an additional complication that statisticians IMHO are ignoring: There are trends from frequencies smaller than 1/data_length. They are there in the real world, they show up as a trend in any subsample of a data set, but AR models won’t generate them because the AR models generate no signal at frequencies below 1/data_length (it’s actually a rolloff curve across frequencies starting at some constant k/data_length).
      Examples of some trends that AR models won’t generate on a 135-year history of temperature: The alleged year 1000 cycle, which can be seen in some proxy records but we don’t have accurate magnitude of. We are pretty sure it’s there though. There are also cycles between 100 and 1000 years that show up in proxy records. They’ll show up as trends, but if you make an AR model with 135 years you aren’t generating them.
      The slope of the trend you are generating has a distribution you aren’t directly controlling for. IMHO you are guessing. Try running a Monte Carlo simulation and generate thousands of AR models and generate a histogram of the trends. Then tell me why that histogram is an accurate representation of real world trends. Given the fact that you aren’t generating trends that are greater than 1/data_length you are likely underestimating the width of that distribution.
      You can also test this by generating 8x length of the original data length with your AR model and then taking the datalength from the middle of that set of data. You’ll find that magic number of 8x maximizes the width of trend distribution as compared to 4 or 16.
      Peter
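
The Monte Carlo experiment suggested above (simulate many series, collect the fitted trends, look at their spread) might look something like this in Python. It is a sketch only: the AR(1) form, the coefficient 0.9, the unit marginal variance and the function names are assumptions for the example, not the contest's generating process, and it does not test the 8x-length claim.

```python
import random
import statistics


def ols_slope(y):
    """Least-squares slope of y against the time index 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx


def ar1_unit_variance(n, phi, rng):
    """Trendless AR(1) series scaled so every point has variance 1 (phi=0 is white noise)."""
    innovation_sd = (1 - phi ** 2) ** 0.5
    x = rng.gauss(0, 1)
    out = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0, innovation_sd)
        out.append(x)
    return out


def slope_spread(phi, n=135, n_sims=1000, seed=0):
    """Standard deviation of fitted slopes across n_sims simulated series."""
    rng = random.Random(seed)
    slopes = [ols_slope(ar1_unit_variance(n, phi, rng)) for _ in range(n_sims)]
    return statistics.stdev(slopes)


white_spread = slope_spread(phi=0.0)
red_spread = slope_spread(phi=0.9)
```

Both sets of series have the same point-by-point variance and no built-in trend, yet the red-noise slopes are spread much more widely. Swapping in a noise model with more power at very low frequencies would widen the spread further, which is the concern raised above.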

    • All data sets contain a trend Willis, because trend has nothing to do with the data set. It is something that is defined by a statistical mathematics algorithm.
      So it works on any data set, regardless of the nature of the numbers in that data set.
      You could pick up a Sunday news paper, and start from the front top line, and read through, and simply collect every number found anywhere and you have your data set.
      Apply the stat algorithm, and it will give you the trend for that data set.
      G

  31. It would be interesting to see which of the very dogmatic, often abusive AGW aficionados DOESN’T have enough confidence in their understanding of the maths and physics involved to stump up the $10 and enter this contest…
    Any suggestions?

  32. When Doug Keenan talks of “random natural variation,” he seems to have a very restrictive AR(1) “random walk” in mind. Such Markov processes have power densities that decline monotonically with frequency. In fact, natural variation of temperature manifests much more structured spectral characteristics, with pronounced peaks and valleys. These can produce far more systematic rises and falls over limited time-spans, which are mistaken as “linear trends” by signal analysis amateurs. The trick to Keenan’s challenge is the very limited time-span of his simulations of random walks coupled with the introduction of non-linear artificial trends. His money thus seems to be safe.

      • Doug:
        It’s the random walks illustrated in your figure that prompted my speculation. Random natural variation is far better illustrated by a Gaussian random phase process, which does not produce the appearance of secular divergence from a common starting point, as in AR(1) diffusion.

    • Are they mistaken as short-lived linear trends, or are they actually short term linear trends? You appear to be claiming that there is some process which leads to short term linear trends but then chastising “amateurs” for finding them and saying: “Look, there is a short term linear trend here!”.

      • A linear trend can always be fit to ANY time series, including arbitrary stretches of totally trendless random processes of wide variety. This computational exercise reveals virtually nothing, however, about the inherent structural characteristics of the process. It’s the lack of recognition of these analytic truths that betrays amateurs, who fall prey to the presumption that the obtained “trend” is an inherent property, instead of a mere phenomenological feature, of the process. Linear regression simply is not an incisive time-series analysis tool.

        • 1sky1: “Linear regression simply is not an incisive time-series analysis tool.”
          Aside from the fact that linear regression produces a straight line – a phenomenon almost totally unknown in nature – from what are generally clearly cyclic processes.
          In climate “science”, a suitable portion of the cyclic function is carefully cherry-picked, the resulting linear regression line is then extrapolated to Armageddon, and we all know what happens next.

      • catweazle666:
        Indeed, as an analytic example, sin(x) is well approximated, for small x, simply by x. The R^2 of the fitted linear “trend” then can be made arbitrarily close to unity with a short-enough record.
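
A quick numerical check of the sin(x) example, as a sketch: fit a straight line to sin(x) sampled on [0, L] and watch R^2 as the record length L shrinks. The helper names and the choice of 200 sample points are arbitrary.

```python
import math


def r_squared_of_line_fit(xs, ys):
    """R^2 of the least-squares straight line through (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


def sine_r2(length, n=200):
    """R^2 of a linear 'trend' fitted to sin(x) sampled on [0, length]."""
    xs = [length * i / (n - 1) for i in range(n)]
    return r_squared_of_line_fit(xs, [math.sin(x) for x in xs])


# Shorter and shorter records of the same purely cyclic function.
for length in (math.pi / 2, 1.0, 0.5, 0.1):
    print(f"record length {length:.3f}: R^2 = {sine_r2(length):.8f}")
```

The function being sampled never changes and has no trend at all over a full cycle; only the window does.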

  33. To Janice Moore and John Whitman
    I think Keenan would not reveal if he could solve the puzzle, and I tend to believe it is not possible, but maybe. I am not going to try, my skills are not up to that, but the quest is intriguing.
    Is it a time series or not, and what is the difference? I don’t see any difference, any random series of numbers could be a time series or several or just random numbers with some filtering/processing.
    Electrical “white” noise is a good analogy (for me at least). If you sample it faster than the bandwidth, each sample depends a little on the earlier samples and a little on “real” noise, but noise it is.
    Noise is what you cannot foresee, whether it is from lack of knowledge or understanding.
    The temperature of the real world is not a single lowpass filter on white noise, but the combination of a lot of filters on a lot of drivers with their own filters with vastly varying timescales, and then come the interdependencies. Climate in a nutshell.
    In digital communication you define the signal to noise as the energy in each bit relative to kT. The bit could last for microseconds or years, but the relation holds. It is all about energy.
    I know of some tests for chaos, and different temperature series have been found to exhibit chaos at any timescale up to the length of the series.
    I really like the idea to give Mann a free trial with his amazing software. 🙂

  34. The prize goes to:

    The Annals of Applied Statistics
    Ann. Appl. Stat.
    Volume 8, Number 3 (2014), 1372-1394.
    Change points and temporal dependence in reconstructions of annual temperature: Did Europe experience a Little Ice Age?
    Morgan Kelly and Cormac Ó Gráda
    We analyze the timing and extent of Northern European temperature falls during the Little Ice Age, using standard temperature reconstructions. However, we can find little evidence of temporal dependence or structural breaks in European weather before the twentieth century. Instead, European weather between the fifteenth and nineteenth centuries resembles uncorrelated draws from a distribution with a constant mean (although there are occasional decades of markedly lower summer temperature) and variance, with the same behavior holding more tentatively back to the twelfth century. Our results suggest that observed conditions during the Little Ice Age in Northern Europe are consistent with random climate variability. The existing consensus about apparent cold conditions may stem in part from a Slutsky effect, where smoothing data gives the spurious appearance of irregular oscillations when the underlying time series is white noise.

    You have to read the whole article to find out that they identify “structural breaks” in the late 19th and early 20th centuries, a different year for each temperature series. If the “pre-break” temperature series adequately satisfy “random variation” then the “trends” since the breaks are “statistically significant” in each temperature series. On this analysis, the “Little Ice Age” label applies equally well throughout the interval between the Medieval Warm Period and the 20th century warm period.
    You can put this into a hierarchical framework in which the “breaks” appear to have happened “randomly”, in which case the authors would not win the prize.
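The Slutsky effect named in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the seed, the series length of 2000 and the 10-point moving average are arbitrary choices, and the helper names are hypothetical. It is not the analysis used by Kelly and Ó Gráda. It measures the lag-1 autocorrelation of white noise before and after smoothing.

```python
import random

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    den = sum(d * d for d in dev)
    return num / den

def moving_average(series, window):
    """Simple unweighted moving average with the given window length."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

rng = random.Random(0)                    # fixed seed, an arbitrary choice
white = [rng.gauss(0.0, 1.0) for _ in range(2000)]
smoothed = moving_average(white, 10)      # 10-point window, an assumption

r_raw = lag1_autocorrelation(white)
r_smooth = lag1_autocorrelation(smoothed)
print(f"lag-1 autocorrelation, raw white noise: {r_raw:+.3f}")
print(f"lag-1 autocorrelation, smoothed series: {r_smooth:+.3f}")
```

The raw series shows essentially no serial dependence, while the smoothed one is strongly autocorrelated, which is what makes smoothed noise look as if it wanders in slow, irregular oscillations.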

    • You have to read the whole article to find out that they identify “structural breaks” in the late 19th and early 20th centuries,

      AAAND the statistical methods for finding structural breaks are flawed in the presence of non-white noise. Too many false positives. Whooops. Something I’ve pointed out to some of the ground-station analysis folks. Haven’t heard a response yet.
      Citation: http://journals.ametsoc.org/doi/pdf/10.1175/JCLI4291.1
      Didn’t have a chance to see if they cited this paper or not
      Peter

      • Peter Sable: AAAND the statistical methods for finding structural breaks are flawed in the presence of non-white noise.
        Thank you for the link to Lund et al.
        ABSTRACT
        Undocumented changepoints (inhomogeneities) are ubiquitous features of climatic time series. Level shifts in time series caused by changepoints confound many inference problems and are very important data features. Tests for undocumented changepoints from models that have independent and identically distributed errors are by now well understood. However, most climate series exhibit serial autocorrelation. Monthly, daily, or hourly series may also have periodic mean structures. This article develops a test for undocumented changepoints for periodic and autocorrelated time series. Classical changepoint tests based on sums of squared errors are modified to take into account series autocorrelations and periodicities. The methods are applied in the analyses of two climate series.

        Like Lund et al, Kelly and O’Grada used autocorrelated noise as one of their models.
        Kelly and O’Grada did not cite Lund et al, and the reference lists of the two papers have surprisingly little overlap (surprising to me).
        All statistical methods are flawed. If Doug Keenan responds to every submission with the comment, like yours, “that method is flawed”, then he’ll never pay the reward. There needs to be some guidance about what noise/natural variation assumptions are considered “good enough”. My later post asserts that if the noise/natural variation model incorporates a period of about 950 years, then no statistical method is going to reject the natural variation hypothesis.
        Another note about Kelly and O’Grada, is that their “headline message” is that a proper change-point analysis reveals no change-points supporting the idea of any particular short (a few decades) “Little Ice Age”. “Only” one change point is supported by their analysis, namely at about 1880.

  35. the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.
    The result will depend heavily on whether the “random natural variation” is assumed to have a spectral density with power at a period of 950 years. If that is specified, then the recent warming has almost surely resulted from “random natural variation”.

  36. Doug:
    Are you going to tell us what percentage of the data sets had trends added to them?
    If not, knowing when we’re done is harder. Just want to know if you are going to tell us that or not.
    I’ve positively identified 21 data sets that have added trends in a random sample of 500 of the data sets. Yeah, I know, slow start. Had a stupid bug in my code.
    Peter

    • Why would somebody who has the ability to detect these trends require to know how many trends he can see in the data? If he can see the trends so clearly as the alarmists claim they can, then they will instantly know how many of the series have trends!

      • Ha, just trying to save ten bucks.
        Seriously though if you are doing differential analysis it does help to know the frequency distribution of the underlying data. Cryptanalysis uses this all the time.
        Doug hasn’t responded, so I assume the answer is “sorry, you figure this out”. I would like Doug to mention in public that any response concerning the data he gives via email will be publicized so that all may benefit.
        Peter

      • Peter Sable: Seriously though if you are doing differential analysis it does help to know the frequency distribution of the underlying data.
        In this context, “help” is a humorous understatement. If the data series spans a time less than one of the periods in the spectral density of the noise then you are just plain out of luck if you do not know the underlying frequency distribution. Even having data covering multiple periods may be insufficient, as shown by Rice and Rosenblatt in a paper in Biometrics about 1988.

  37. If we give $10 to every African and ask them to write 1000 selections of TRUE and FALSE each… then we are bound to prove that these trends can be detected.

  38. If a trend is so difficult to detect then by definition it is too weak to worry about. That is what I cannot get my head around! Why people care so much about this virtually undetectable thing!

  39. I am really surprised that no-one (unless I missed it) has mentioned the 300 year old Central Limit Theorem in this comment stream. An annual global temperature is not a measured temperature but an average of over a million daily min/maxs x 365 days x (say) 2000 stations. There are of course local correlations over short time and space in that population but enough effectively independent ones that the Central Limit Theorem will operate. That is, when annual differences are looked at as a population then the result will tend towards a Gaussian distribution, independent of the underlying distributions of the raw data population. (Reminder: summing or averaging is an information-destroying process!) That is why every plot ever seen of global temperature index data is jagged, with the annual differences being extremely difficult to distinguish from a Gaussian distribution.
    That of course does not preclude there being, additionally, a trend component which is a common component of all million+ raw measurements which is what AGW theory demands. But that is Doug’s challenge; can you repeatedly tease out the natural (Gaussian) from the trend?
    Actually a statistician with time on their hands could probably work out, by looking at raw data variance at shorter time and space frames, what the effective number of independent measurements is in that million plus raw data population and from that compute the expected variance of the year-to-year differences and from that…. the probability of detecting a trend.
    Maybe Doug has already done it; that’s why he is confident enough to put $100k on the line!!
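The Central Limit Theorem argument in this comment can be sketched with a toy simulation. Everything here is an assumption made for illustration: the raw "measurements" are independent exponential draws (strongly skewed), 100 draws are averaged per "annual" value, and the helper names are hypothetical. Real station data are correlated in time and space, as the comment itself notes, so this only shows the direction of the effect.

```python
import random

def sample_skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def means_of_exponential_draws(rng, n_means, draws_per_mean):
    """Return n_means averages, each over draws_per_mean exponential draws."""
    return [
        sum(rng.expovariate(1.0) for _ in range(draws_per_mean)) / draws_per_mean
        for _ in range(n_means)
    ]

rng = random.Random(1)  # fixed seed, an arbitrary choice
skew_raw = sample_skewness(means_of_exponential_draws(rng, 5000, 1))
skew_avg = sample_skewness(means_of_exponential_draws(rng, 5000, 100))

print(f"skewness of single draws:      {skew_raw:.2f}")
print(f"skewness of 100-draw averages: {skew_avg:.2f}")
```

The single draws keep the heavy skew of the underlying distribution, while the averages are already close to symmetric, in line with the claim that averaging pushes the result towards a Gaussian shape regardless of the raw distribution.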

  40. The way this competition is done you can not rule out selection bias of the seeds used to create the data. By which I mean he could have generated 10,000s of datasets and then picked the sets that randomly had long term trends. This would make a statistical analysis exercise like this impossible.
    I am not saying that’s what’s been done but there is nothing here to safeguard against it.
    A way to possibly protect against this is for the entries to be code that can analyze the 1000 series and identify which contain a trend and which do not.
    After the competition closes, the code to generate the series is verified and a new set of 1000 series is generated from randomly generated seeds, with the trend randomly added to some of them. Then the winner is the code that can correctly identify at least 90% of the series in the newly generated data. This way you can rule out any selection bias.

  41. The way this competition is done you cannot rule out selection bias of the seeds used to create the data. By which I mean he could have generated 10,000s of datasets and then picked the sets that randomly had long term trends. This would make a statistical analysis exercise like this impossible.
    I am not saying that’s what’s been done but there is nothing here to safeguard against this.
    A way to possibly protect against this is for the entries to be code that can analyze the 1000 series and identify which contain a trend and which do not.
    The code to generate the series is verified and a new set of 1000 series is generated from randomly generated seeds, with the trend randomly added to some of them. Then the winner is the code that can correctly identify at least 90% of the series in the newly generated data. This way you can rule out any selection bias.

  42. I don’t see the point of this contest.
    Suppose you have 4 dice: two fair, one loaded with a “positive” bias and one with a “negative” bias.
    Now suppose you have records of 1000 series of throws of these dice, without the information about which die produced each series. This contest is basically about trying to guess that.
    Can it be done? Obviously it depends on how heavily the unfair dice are loaded.
    I guess the climate load is not enough to distinguish the dice at the 900/1000 level.
    However, our climate problem is quite different. It is: “given the unique series I have, which shows a trend, which is the more reasonable to believe:
    H0) this trend is a product of randomness and has no reason to keep going [climate die is fair]
    H1) this trend has some causes that didn’t disappear, so it will keep going [climate die is loaded]”
    For sure we cannot rule out H0 at the 95% or even the 90% level. However, H1 remains more probable (at maybe 51 vs 49).
    This of course does not provide any hint about the causes of this slow probable trend. Since this trend began several centuries ago, GHGs are most probably not the cause …

    • All of the models assume that the data are accurate. A major problem is that of data coverage: it has changed over time. We would normally expect recorded temperatures to creep upwards over time due to the location of stations in urban areas, where there are many more buildings, concrete, parking lots, airports and roads than there were 135 years ago. In any case the original question is not how temperatures increase with time but rather how temperatures increase with the addition of CO2 to the atmosphere by burning fossil fuels. We are using the wrong independent variable.

      • In any case the original question is not how temperatures increase with time but rather how temperatures increase with the addition of CO2 to the atmosphere by burning fossil fuels.

        I think this is the proper question, though we should also include all of the other possible sources of warming, even if we don’t think they exist.
        But my premise is that the main effect is a loss of night time cooling, not warming. I have been calculating the difference between today’s day time warming and tonight’s night time cooling for all stations that measure at least 360 days a year (at 360, almost all stations are 365/366). I then average each of these stations, and there’s no sign of a loss of cooling. In fact since 1940 there’s a slight cooling; if you include measurement uncertainty, the average of all years is 0.0F +/- 0.1F.
        https://micro6500blog.wordpress.com/2015/11/18/evidence-against-warming-from-carbon-dioxide/

  43. I’m not sure the problem to be solved is clearly stated. Looking at the first few I find two interesting series:
    1) There is a very pronounced upward linear trend for the first half and then no trend for the second half. (Like the UAH satellite data). How should one reply to this?
    2) There is no trend for the first half, then a jump and then no trend for the second half. This will baffle a least squares fit – or a ranked sign test. (This type of behavior used to be very common in interest rates data. The series would jump every time the FED changed the rate).
    Doug is making a very clear point here. I don’t think that you can run these series through a “trend detector” and just spit out the results. You will probably need to check each series individually.
    His critique on statistical methods used in climate modeling is scathing.
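The second case described above (flat, jump, flat) can be sketched in a few lines. This is a hypothetical, noise-free series of 100 points with a unit jump at the midpoint, not one of the contest series, and `ols_slope` is an illustrative helper. It shows that a single least-squares line over the whole record reports a clear slope even though neither half contains any trend.

```python
def ols_slope(ys):
    """Least-squares slope of ys against the time index 0, 1, ..., n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx

n = 100
# Hypothetical step series: flat at 0, then flat at 1 after the midpoint.
step = [0.0] * (n // 2) + [1.0] * (n // 2)

slope_all = ols_slope(step)
slope_first = ols_slope(step[: n // 2])
slope_second = ols_slope(step[n // 2:])

print(f"slope over full record: {slope_all:.4f} per step")
print(f"slope over first half:  {slope_first:.4f}")
print(f"slope over second half: {slope_second:.4f}")
print(f"fitted rise over record: {slope_all * (n - 1):.2f} (actual jump: 1.00)")
```

The fitted line implies a rise of about one and a half units over the record, which overstates the single unit jump and says nothing about when it happened. That is one reason a blind "trend detector" can mislead on series of this kind.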

  44. Odd that anybody thinks this ‘challenge’ has anything to do with the study of nature. Nature might be operating by a quasi-random process, but is always limited by the laws of conservation of energy, and so forth. The ‘challenge’ has no such constraint. 1000 time series selected by unstated methods, from an unstated statistical generating method (or methods, there being an infinity thereof; one could easily produce every series from a different generator), with unstated types of functions added to an unstated fraction of those series … that’s not even a remotely interesting question in mathematics. Much less science.
    As I observed elsewhere, not a bad method to collect the entry fees and buy yourself a present.
    Digressing:
    Many comments have mentioned Nyquist frequency, almost all of them incorrectly. The Nyquist frequency is the most rapid cycle that can be detected without aliasing, and its period is 2*dt. For annual data, 2 years is the shortest period. A comment mentioned something ‘about length/5’. This isn’t Nyquist frequency/period. Rather, it’s a fair rule of thumb as to the longest period one can make reasonable statements about in a spectral analysis. For 135 years, that’s a cycle of 27 years. Hence talk of 30 year cycles solely from the data is perhaps doable, though shaky. 50-60 year cycles are too long.
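The period limits discussed in this comment are simple arithmetic, sketched below. The function name `resolvable_periods` and its parameters are hypothetical; the divisor of 5 is the commenter's rule of thumb, not a theorem. The fundamental period (record length) is included for reference because it is the longest period on a discrete Fourier frequency grid.

```python
def resolvable_periods(n_samples, dt=1.0, rule_of_thumb_divisor=5):
    """Return characteristic periods (in units of dt) for a sampled record."""
    record_length = n_samples * dt
    return {
        # Shortest period representable without aliasing.
        "nyquist_period": 2 * dt,
        # Period of the lowest nonzero Fourier frequency, 1 / (n_samples * dt).
        "fundamental_period": record_length,
        # Rule-of-thumb longest period one can say much about: length / 5.
        "rule_of_thumb_longest_period": record_length / rule_of_thumb_divisor,
    }

periods = resolvable_periods(135)  # 135 annual values
for name, value in periods.items():
    print(f"{name}: {value:g} years")
```

Taking the length/5 rule at face value, 135 annual values give 135/5 = 27 years as the longest period about which a spectral analysis can say much.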

      • This isn’t Nyquist frequency/period.

        I’m the one talking about this, let me defend my comments.
        Robert, I believe Nyquist is symmetrical. A window of data such as the 135 year temperature record is missing both low and high frequencies – you can’t resolve anything useful above or below a set of 2 frequencies.
        Everyone knows that the high end cutoff of resolution period is 2/sample rate. The literature on this is overwhelming, a simple google search gives bazillions of results, and everyone is subjected to marketing literature on their audio equipment.
        However, there has to be a low frequency resolution cutoff as well. What is it? Google is a complete fail on this. In fact low frequency signal processing isn’t really well studied AFAICT. Look at how bad Mann screwed it up…
        I haven’t created a formal mathematical proof, but I’ll try English.. With that entire window of 135 years, we effectively have one sample for low frequencies. By Nyquist we need two samples. Ergo, we can only resolve 67.5 years. I’ll also add that the math involved is pretty symmetrical, which also justifies my claims (short of a formal proof).
        The 5x sampling “rule of thumb” applies at both the low end and the high end. On the low end, it’s due to being unable to resolve multiple overlapping signals of slightly different frequency and phase. I arrived at the number via Monte Carlo simulation and thinking about the problem a lot.
        At the high end, due to sampling error and filtering issues, you actually need better than Nyquist. For example most digital oscilloscopes start at 4x oversampling, the good ones 8x. Your CD player scrapes by with 2.2x, but when I still had good (young) ears I could hear the distortion in the cymbals due to the high slew rate in combination with high frequencies. That’s why pro audio is 96 kHz to 192 kHz (5x-10x).
        I’d really love if someone could find some formal work on low frequency resolution of a limited sample length. It’s just not well studied by signal processing type folks. At least, I can’t find a good reference. Statisticians, AFAICT, are completely ignorant of this entire issue…
        Finally, as to your comment on 30 year cycles. There are not (major, known) 30 year natural cycles, there are 30 year HALF cycles (as defined by signal processing folks); the actual cycles are on the order of 60-75 years. People get confused and think going from High to Low is a cycle. It is not. You have to go back to High to complete the cycle… I note Wikipedia is completely confused on this matter… but good luck editing anything climate related there… don’t trust anything you read that doesn’t have a graph where you can visually verify what a cycle is.
        Peter

      • I completely forgot to add another justification that also shows the symmetry in English instead of math.
        Nyquist was originally studying to see how many symbols could be transmitted per sampling period. Which is a different use of the term Nyquist than we use today. It’s called the Nyquist Interval:
        from: https://en.wikipedia.org/wiki/Nyquist_rate#Nyquist_rate_relative_to_signaling

        “If the essential frequency range is limited to B cycles per second, 2B was given by Nyquist as the maximum number of code elements per second that could be unambiguously resolved, assuming the peak interference is less than half a quantum step. This rate is generally referred to as signaling at the Nyquist rate and 1/(2B) has been termed a Nyquist interval.”

        We have a cycle rate of B = 1/135 cycles per year (the length of the sample window). We wish to resolve a symbol (the underlying signal). The symbol rate we can resolve is 1/(2*1/135) = 67.5 years.
        Of course I could have inverted the logic and be off by 4x. Then explain why you can’t see it on an FFT… Sometimes the numerical simulations are useful, for example in proving you didn’t screw up by a factor of 4. Unlike climate simulations, that try to resolve 10x finer grain details…
        Also note the comment about “peak interference”. I’m pretty sure that’s sampling noise or other confounding signals…
        Peter

  45. The Cartesian mathematician will argue that everything is deterministic, and that any collection of N data points can be exactly described using an orthonormal model that spans the data domain and includes N contributing factors; the theoretical extension of the adage ‘it takes two points to determine a line.’
    Statistics is that branch of mathematics that provides tools for handling problems where the number of data points in a situation far exceeds the number of known factors.
    Statistics has absolutely no connection with causality, however. That issue is handled in the selection of the model, and is best facilitated by an understanding of the physics involved.
    Given a set of data points (values of a dependent variable paired with values of an independent variable), I could fit the set exactly with a Taylor series, a Fourier series, or any number of orthogonal polynomials. The choice I make will be influenced by what I understand the data to represent.
    If I expected a combination of periodic and secular (non-periodic) influences, I might even try a full Fourier decomposition and make a fit to the transformed data to separate the periodic and non-periodic functions.
    The determination of whether or not there is a truly random influence would involve fitting a subset of N-1 data points with the physically most appropriate model, and then checking whether that has any effectiveness in predicting the value of the missing data point. Comparing the predicted and observed values of each point for all N data points and then examining this new set for an r^2 correlation will tell me whether or not there is randomness present.

    • Deterministic does not mean predictable.
      This was the key insight of Ed Lorenz in his foundational 1963 paper “Deterministic Nonperiodic Flow”.
      I wonder if Doug is using Lorenz’ DNF61 code. That would be very cool.

  46. It should first be noted that in the IPCC Third Assessment Report – Chapter 14: Advancing Our Understanding, the following statement appears in sub-section 14.2.2.2:
    “In sum, a strategy must recognise what is possible. In climate research and modelling, we should recognise that we are dealing with a coupled non-linear chaotic system, and therefore that the long-term prediction of future climate states is not possible. The most we can expect to achieve is the prediction of the probability distribution of the system’s future possible states by the generation of ensembles of model solutions. This reduces climate change to the discernment of significant differences in the statistics of such ensembles. The generation of such model ensembles will require the dedication of greatly increased computer resources and the application of new methods of model diagnosis. Addressing adequately the statistical nature of climate is computationally intensive, but such statistical information is essential.”
    Now go win the $100,000!

    • Thanks, Doug. However, I have a problem with your claim:

      In essence, the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.

      However, rather than requiring that people can demonstrate it via “statistical analysis”, you have a problem which requires that we get an “A” (90% correct). I know of no statistical analysis which requires this kind of certainty. Here’s what I mean.
      Suppose for the sake of argument that half of your data has an added trend. If I could identify say 60% of the trended ones correctly, that would have a p-value of 2.7E-10. This is far more than is needed to “demonstrate, by statistical analysis”, that we can tell the difference between trended and untrended datasets.
      Of course, your test is even harder because you haven’t said how many are trended and how many are not. As a result, you are demanding much, much, much, much, much more from your challenge than is required to distinguish a series with a trend from one without a trend.
      As a result, I fear that as it stands your challenge is useless for its stated purpose. It has nothing to do with whether or not we can distinguish trended from untrended data via “statistical analysis” because the threshold you’ve set is far above what is needed by statistical analysis.
      However, none of this means that I’ve given up on your puzzle. I estimate I can get about 80% accuracy at present, more for subsets. And I just came up with a new angle of attack, haven’t tried it yet. I suspect that this problem is solvable, I just have to get more clever.
      My best to you, and thanks for the challenge,
      w
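The p-value quoted in this comment can be approximated with an exact binomial calculation. The reading below is an assumption, since the comment does not spell out the test: 1000 series are classified, 600 are labelled correctly, and the null hypothesis is coin-flip guessing with probability 0.5 per series. The helper name `fair_coin_upper_tail` is hypothetical.

```python
from math import comb

def fair_coin_upper_tail(n, k_min):
    """Exact P(X >= k_min) for X ~ Binomial(n, 0.5), using integer arithmetic."""
    favourable = sum(comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2 ** n

n_series = 1000   # assumed number of series classified
n_correct = 600   # assumed 60% labelled correctly

one_sided = fair_coin_upper_tail(n_series, n_correct)
two_sided = 2 * one_sided  # the fair-coin null distribution is symmetric

print(f"one-sided p-value: {one_sided:.2e}")
print(f"two-sided p-value: {two_sided:.2e}")
```

Under these assumptions the two-sided value should land in the neighbourhood of the 2.7E-10 quoted above, many orders of magnitude below conventional significance thresholds, which is the point being made about the 90% requirement.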

  47. This competition is fraudulent. What Keenan says on his web page in the 3rd paragraph is “I am sponsoring a contest: the prize is $100 000. In essence, the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.”
    In fact this can be demonstrated quite simply by considering the decadal variation in the temperature anomaly. Every single decade since 1950 has been warmer than the previous decade. The probability of getting such a sequence by chance is less than 2%. However, Keenan invites us to solve a mathematical problem without providing any evidence whatsoever that it has any bearing on global climate. In fact, if you dissect his fallacious logic, it goes like this:
    1. Assume that climate is random
    2. Devise an insoluble problem involving random numbers and trended vs trendless process
    3. Conclude that since the problem is unsolvable, assumption (1) must be true

      • I think you and he are saying slightly different things:
        You are saying that periods of aggregate multi-decadal temperature rise have been seen prior to assumed AGW “start time” (1950) of the same order of magnitude as the rises since 1950. Some have been as long as 50 years. You are right and I have never seen an effective rebuttal. However within a substantial multi-decadal rise there will usually be one or more decades (if you choose the decade) where it didn’t rise, interrupting the contiguous sequence on which he bases the claim of very low probability
        He is saying that, choosing decades starting 1950, there have been 5 sequential “up” decades. If the system was largely random with roughly 50% chance of up or down and no autocorrelation that is indeed a 2% chance event. He is right. However, the year on year variance is quite large in relation to the year on year trend change, so I suspect I could start the decades at slightly different years and easily produce several possible down decades which would interrupt the 5 UPs and become, say, a mixture of 4 UPs and 1 Down in some sequence, which is a much more common occurrence. However I would then have to try every year as a start year and see how many showed that pattern to calculate the equivalent to the 2% chance for each. And then weight them by the (equal) probability of starting the decade in that year to calculate overall the probability of the observed result, independent of the start year. It would be higher than 2% but I don’t know how high. It is Sunday afternoon after all.
        Jonathan Paget

    • I already said that this competition is pointless. That doesn’t make it fraudulent, and it doesn’t aim at proving that climate is random. This competition aims at proving that we just cannot be reasonably sure (at 90%) that the climate die is biased toward warming, since we are not able to distinguish between loaded dice.
      Actually, remember that a canonical definition of randomness is “Meeting of two independent causal series”, and indeed there are two (and even more: I leave it to you to name some of them that are known) independent causal series in climate. So, no discussion, there is randomness in climate, period. No assumption here, simple fact.
      “Every single decade since 1950 has been warmer than the previous decade. The probability of getting such a sequence by chance is less than 2%. ”
      No. In a long enough random process the probability of getting such a sequence by chance is 1 (100%). It necessarily had to happen some day. And temperature series are indeed long enough for this to apply.
      So “less than 2%” refers to the probability of you being alive in an era when you can say that? But we KNOW for sure that you live in our present era, as opposed to many other eras when people couldn’t have said that. The probability is 1, again, not “less than 2%”.

  48. From the GISS global anomaly, take the average of each decade centered on whole decades,
    meaning 1885-1894, 1895-1904 and so on with the last decade being 2005-2014.
    When you do this you get:
    1890, -0.273
    1900, -0.224
    1910, -0.353
    1920, -0.259
    1930, -0.185
    1940, 0.031
    1950, -0.041
    1960, -0.028
    1970, -0.018
    1980, 0.147
    1990, 0.298
    2000, 0.510
    2010, 0.652
    As you can see, every decade since 1950 is warmer than the previous one. In fact in the entire series there are only two decades that buck the rule, 1910 and 1950.
    Jonathan Paget speculates without any actual evidence that you might get quite different results by shifting the central year. In fact the only effect of doing so is that for a few choices, 1950 is then warmer than 1960. The sequence from 1960 onwards remains the same, even at the cost of omitting some data from the past decade. If anyone is interested I can paste the code. (python).
    paqyfelyc objects that if you try long enough, tossing an unbiased coin will eventually turn up the same sequence. However, that is not the problem at hand. We are not given an infinite number of tries. If you have 13 coin throws and are asked to conclude, based on these coin throws, whether the coin is biased or not, what is your conclusion as a statistician?
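The decadal figures listed above can be checked directly. The sketch below is not the Python code the commenter offers to paste; it is a hypothetical reconstruction using only the thirteen values in this comment, and it assumes the simple null model discussed in the thread, in which each decade is independently warmer or cooler than the previous one with probability 0.5.

```python
# Decadal mean anomalies as listed in the comment (decade centre: value).
decadal_anomaly = {
    1890: -0.273, 1900: -0.224, 1910: -0.353, 1920: -0.259, 1930: -0.185,
    1940: 0.031, 1950: -0.041, 1960: -0.028, 1970: -0.018, 1980: 0.147,
    1990: 0.298, 2000: 0.510, 2010: 0.652,
}

years = sorted(decadal_anomaly)
pairs = list(zip(years, years[1:]))  # (previous decade, current decade)

# Decades that were cooler than the one before.
cooler = [cur for prev, cur in pairs
          if decadal_anomaly[cur] < decadal_anomaly[prev]]

# Length of the unbroken run of "warmer than previous" ending at the last decade.
run_length = 0
for prev, cur in reversed(pairs):
    if decadal_anomaly[cur] > decadal_anomaly[prev]:
        run_length += 1
    else:
        break

# Probability of that run under the assumed independent 50/50 model.
p_run = 0.5 ** run_length

print("decades cooler than the previous one:", cooler)
print("consecutive warmer decades at the end:", run_length)
print(f"probability under a 50/50 model: {p_run:.4f}")
```

With these figures the run is six decades long (1960 through 2010), so the 50/50 model gives 0.5^6, about 1.6%, consistent with the "less than 2%" figure. It does not address the objections raised elsewhere in the thread about autocorrelation or the choice of starting year.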

    • And this is exactly what you would expect based on the knowledge that global average temperatures have been rising ever since their low point in 1650. The biggest question is: Will today’s Modern Warm Period peak in 2000-2015 be the maximum between 1650 and 2550’s Modern Ice Age?
      Or will today’s Modern Warming Period max out at the next 66 year short cycle peak in 2070, or 2140?
      When do we begin going back down to the next ice age?

    • Hello TomP
      Thank you for the courteous response. I was reacting to what I initially saw as the faux precision of the 2% figure in the light of some ambiguity around start dates, decadal starts, etc., and the fact that really quick “non rigorous” probability analyses sometimes have stings in their tails.
      However, I’ve learnt pragmatically that truth is almost never further illuminated by calculating probabilities beyond one decimal point and often not even TO the first decimal place!! So I readily concede that the notion that GISS Global decadal surface anomaly change from 1950 to 2014 is generated from a purely random (Gaussian) “walk” of decadal steps would be a VERY LOW probability hypothesis and any rational analyst would look for a different starting point.
      There clearly is an upward trend in decadal changes that has a persistence in time beyond that which pure chance can sensibly account for.
      So, while I have your attention, can I ask your critique of the following elementary logic, which is in principle a simpler approach to the same question?
      The standard deviation of the annual anomaly changes 1950 to 2014 is 0.14 degrees. The time period n is 64 years (convenient for mental arithmetic!). A Gaussian random walk with 0 mean and n steps will produce a distribution of end points with 0 mean and standard deviation = root(n) x 0.14 = 8 x 0.14 = 1.12 degrees.
      The actual increase in anomaly in the 64 years was 0.89 degrees. What is the probability that the end point meets or exceeds that actual result?
      Brief pause whilst I consult old fashioned Normal Distribution tables
      Answer….. 24%
      Presumably the answer rests in the choice of time periods varying between annual (me) and decadal (you). But isn’t my annual approach equally valid and how would we rationalise the difference? By the way I would characterise 24% as LOW not VERY LOW!
      regards
      Jonathan
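The random-walk arithmetic in this comment can be reproduced with the standard normal tail function. The sketch below uses only the numbers stated in the comment (step standard deviation 0.14 degrees, 64 annual steps, observed rise 0.89 degrees) and assumes, as the comment does, independent zero-mean Gaussian steps. The helper name `normal_upper_tail` is hypothetical.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

step_sd = 0.14        # standard deviation of annual changes, from the comment
n_steps = 64          # years from 1950 to 2014
observed_rise = 0.89  # actual change in anomaly over the period

# End point of a zero-mean Gaussian random walk: sd grows as sqrt(n).
end_sd = math.sqrt(n_steps) * step_sd
z = observed_rise / end_sd
p_exceed = normal_upper_tail(z)

print(f"sd of end point: {end_sd:.2f} degrees")
print(f"z-score of observed rise: {z:.3f}")
print(f"P(end point >= observed rise): {p_exceed:.3f}")
```

With these inputs the z-score is about 0.79 and the tail probability comes out a little above 21%, somewhat lower than the 24% read from tables but in the same range, so the characterisation of the result as low rather than very low is unaffected.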

  49. talking about sports – not that acquainted with baseball –
    but FIFA, formula 1, skiing …
    each season brings new regulations to hold suspense and weld public to the view:
    really ‘self regulating’ systems.
    Regards – Hans

  50. leaves the question what’s FIFA on this blue environmantled system –
    life’s persistence vs. Entropy,
    real diesels stronghold vs. EPAs imagined *pony farms, …
    Hans
    *worldwide resorts
