
Ross McKitrick writes via email:
A UK-based math buff and former investment analyst named Douglas Keenan has posted an intriguing comment on the internet. He takes the view that global temperature series are dominated by randomness and contain no trend, and that existing analyses supposedly showing a significant trend are wrong. He states:
There have been many claims of observational evidence for global-warming alarmism. I have argued that all such claims rely on invalid statistical analyses. Some people, though, have asserted that the analyses are valid. Those people assert, in particular, that they can determine, via statistical analysis, whether global temperatures are increasing more than would be reasonably expected by random natural variation. Those people do not present any counter to my argument, but they make their assertions anyway.
In response to that, I am sponsoring a contest: the prize is $100 000. In essence, the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.
He would like such people to substantiate their claim to be able to identify trends. To this end he has posted a file of 1000 time series, some with trends and some without. And…
A prize of $100 000 (one hundred thousand U.S. dollars) will be awarded to the first person, or group of people, who correctly identifies at least 900 series: i.e. which series were generated by a trendless process and which were generated by a trending process.
You have until 30 November 2016 or until someone wins the contest. Each entry costs $10; this is being done to inhibit non-serious entries.
Good luck!
Details here: http://www.informath.org/Contest1000.htm
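For readers wondering what an entry would even involve, here is a deliberately naive sketch (pure illustration, not Keenan's method): classify each series as "trending" when the t-statistic of its OLS slope is large. On strongly autocorrelated series this test is wildly overconfident, which is exactly the kind of invalid analysis Keenan is criticizing.

```python
import random
import statistics

def ols_slope_t(series):
    """Return the OLS slope of a series against its index, and the slope's t-statistic."""
    n = len(series)
    xs = range(n)
    xbar = statistics.mean(xs)
    ybar = statistics.mean(series)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, series)) / sxx
    intercept = ybar - slope * xbar
    resid = [y - (intercept + slope * x) for x, y in zip(xs, series)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    return slope, slope / (s2 / sxx) ** 0.5

# A trendless random walk frequently gets flagged as "trending" by this test:
random.seed(1)
walk = [0.0]
for _ in range(134):
    walk.append(walk[-1] + random.gauss(0, 0.1))
slope, t = ols_slope_t(walk)
print(slope, t)
```

Because the contest series come from trendless but autocorrelated processes, a naive |t| > 2 rule will badly overcall trends; any serious entry needs a null that respects the autocorrelation.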
I sense this is a followup to Keenan’s row with the Met Office, in which he challenged the Met Office’s AR1 model to make the point that the model was incapable of determining statistical significance in the temperature series. He proved to the Met Office (and Parliament) that he was right. This challenge and prize is essentially an extension of what he has already demonstrated. This is not a challenge I would spend time trying to win, as I am confident that, once again, he is right.
This should be interesting.
This could be interesting, and a real test of my wiggle matching skills or lack thereof. Besides that I have always been a bit of a gambler.
I have not understood what Mr. K. said. There is one thing I do know: it is getting warmer. Did you notice?
No.
I noticed that too whenever I’m making a shepherd’s pie. Every time I open the oven door…voila, my face gets hotter! Must be Glo.Bull warming, eh!!! snarc….
🙂
I have not noticed it myself, but going into winter in Canada, it will drop 10C in 12 hours by the weekend. I just wish it were 0.85C cooler so my children would not die before the age of twenty.
lol
Temperature is not random, just the output of an incomprehensible number of inputs. The contest proposed is not a solvable puzzle (in my lifetime at least), and Ross knows this, but it is the antithesis of the null hypothesis. If you cannot prove randomness, then prove non-randomness: a brain-teaser, if you will. Unsolvable!
All the energy in the Earth surface/atmosphere system comes from the Sun (a small amount comes from the continuing, very slow cooling of the interior, and all the atoms in the Earth system contain an unknown amount of energy bound up in them), but …
Energy In (Sun) – Energy Out Delayed (OLR and Albedo) = Energy surface/atmosphere = Temperature
It is NOT an easy thing to calculate/measure these components (let’s say impossible), and the resulting temperature will therefore appear random. Even the Delay component has never been seriously estimated by anyone; it can actually be approximated in hours.
Climate science assumes that “Energy In” is constant (give or take a small solar cycle, which has no impact anyway), that “Energy Out (Albedo)” hardly changes at all except when CO2 melts ice over a long period of time, and that “Energy Out (OLR)” only varies because of CO2/GHGs. They don’t even think about the “Time” dimension inherent in the equation. There are no real climate models; there is only a way of “thinking” about the equation.
What would be the temperature at the surface without the Sun? 3.0K or so one could guess.
Instead, it is 288.0K, and in Earth history since the end of the Late Heavy Bombardment this value has only varied between 263K and 300K. What is the main determinant of why it has only varied between these two values? (Answer: the Sun has warmed some over the eons, but it is really the Albedo component that has been the driver of a Hot Earth or a Cold Earth.) Why would it be any different on shorter timescales?
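The albedo point can be illustrated with the textbook zero-dimensional energy balance (a sketch, not a climate model; the solar constant and albedo values are assumed round numbers). The effective radiating temperature is T = [S(1−α)/4σ]^¼; the gap between roughly 255 K and the observed 288 K is the greenhouse effect, and shifting albedo moves the number a lot:

```python
SIGMA = 5.670e-8       # Stefan-Boltzmann constant, W m^-2 K^-4
S0 = 1361.0            # present-day solar constant, W m^-2 (assumed value)

def effective_temp(albedo, solar=S0):
    """Planetary effective radiating temperature from a simple energy balance."""
    absorbed = solar * (1.0 - albedo) / 4.0   # incoming flux averaged over the sphere
    return (absorbed / SIGMA) ** 0.25

print(round(effective_temp(0.30)))  # ~255 K with today's approximate albedo
print(round(effective_temp(0.60)))  # a high-albedo "snowball" state is far colder
```

Even this toy model shows the albedo term dominating the result, which is the commenter's point; everything else (clouds, lapse rate, GHGs) is hidden inside the gap to 288 K.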
” What is the main determinant of why it has only varied between these two values (answer, the Sun has warmed some over the eons but it is really the Albedo component that has been the driver of a Hot Earth or a Cold Earth. Why would it be any different on shorter timescales.”
Every night the Sun goes down, and outside the tropics the length of the day changes; as the days shorten, the daily temperature drops. On clear, low-humidity nights it keeps dropping until morning. Averaging all stations that record for a full year, since 1940 the nightly cooling slightly exceeds the prior day’s warming.
The energy balance at these stations shows no warming trend.
Has anyone found out where people can go to submit answers? I was just goofing off, and I came up with a list as a guess at an answer. I don’t know if I’d want to spend $10 given how quickly I put the list together, but even if I did, I don’t see how I’d go about doing it.
Yes! #(:))
HOW TO ENTER THIS CONTEST:
Hurrah! #(:))
(I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!!)
Wait, what? The rules specifically say entries must be accompanied by the entry fee. In what world does that mean people must e-mail him then wait for a response telling them how they can submit an entry fee? By definition, that means the method for entering the contest hasn’t been disclosed.
This does not matter. I say this because Paul Ehrlich is a Stanford professor and environmentalist celebrity and not a retired shoe salesman.
So, basically, being proven wrong means nothing in the scientific and scientific celebrity community. Which means that the community does not exist on a functional level, but there you go.
My analysis shows the opposite: that the patterns we find and try to explain in terms of cause and effect can be generated by randomness, because nature’s signals contain long-term memory and persistence, what the IPCC has referred to as “inherent chaotic behavior”.
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2689425
Start with an equation derived from conservation of energy. Calibrate it to the measured average global temperature (AGT) trajectory by maximizing the coefficient of determination. The equation is a 97% match to 5-year-smoothed AGT measurements since before 1900. http://agwunveiled.blogspot.com
Dear Ross,
I would like to direct your attention to a multitude of previous campaigns like this one, and their rather poor performance. (I’ll leave it like that, unsupported by facts.)
And let me be bold enough to suggest a better approach. In the age of the net, to get good attention you need there to be something for the supporters to do: something that shows the support and makes both it and the challenge newsworthy.
So I suggest you make it possible for people to fund others’ tries. Let’s say I could fund M. Mann’s try at the 100k$, and you could send him an email saying he had one free try. Copy that to some news org. You do that every time somebody gets sponsored, and, for decency and newsworthiness, add that 10$ to the prize money. Get one of those counters here at WUWT, and a “one click donate to your ‘favorite’ climate change researcher” ad or other. The possibilities for spin and press are huge. (M. Mann has so far “collected” xx$ for: “The search for certainty”.)
And who knows, it might grow on you.
You guys are a little late to the party here, climate as a random walk (or not) has come up over many decades. Yes, the proposed puzzle is unsolvable, but proving that climate is not a random walk is not.
https://tamino.wordpress.com/2010/03/11/not-a-random-walk/
Does that mean someone named after a pair of sunglasses can solve this?
James, you will have to do something, MUCH, actually, to:
1) qualify your expert witness, Grant Foster, a.k.a., “Tamino;” and
2) to rehabilitate his credibility
if you want us to take anything he writes seriously.
Re: Foster’s Qualification as an Expert:
“Rahmstorf et al {Grant Foster} (2012) assume the effects of La Niñas on global surface temperatures are proportional to the effects of El Niño events. They are not. Anyone who is capable of reading a graph can see and understand this.
Bob Tisdale here: http://wattsupwiththat.com/2012/11/28/mythbusting-rahmstorf-and-foster/
I’ll take Bob Tisdale’s analysis over “Tamino’s” any day.
Your “expert’s” credibility is also questionable:
I find it difficult to believe that something so obvious is simply overlooked by climate scientists and those who peer review papers such as Rahmstorf {and Foster} (2012). Some readers might think the authors are intentionally being misleading. ***
The sea surface temperature records contradict the findings of Rahmstorf et al {Foster} (2012). There is no evidence of a CO2-driven anthropogenic global warming component in the satellite-era sea surface temperature records. Each time climate scientists (and statisticians) attempt to continue this myth, they lose more and more…and more…credibility.
Bob Tisdale (Ibid.)
And this is just ONE example. Typical, according to WUWT commenters on this thread: http://wattsupwiththat.com/2014/10/12/is-taminos-open-mind-dead-or-just-in-an-extended-pause/
So it’s fair to say that you believe the best model for climate is one which allows for unmitigated, unpredictable swings to unbounded extremes? You must be a huge proponent of extreme adaptation and mitigation spending.
Science does not work by proof. It works by falsification, which lends evidence to support conclusions. To suggest that “proving climate is not a random walk is not [unsolvable]” is philosophy.
Science does not, mathematics does. The proposed problem is purely mathematical with no physical basis. QED.
There are some who disagree with this premise – that science can’t prove anything. Some would be wrong. Science is based on experiments and observations which can arrive at flawed conclusions that are well accepted until some time in the future when the flaw is discovered.
The basic test of any scientific theory is whether or not it can be falsified. If there is no proposed method to falsify a theory, then it is not truly a scientific theory. Experiments and their measurements can only disprove a theory or be consistent with it.
Karl Popper demonstrates quite clearly in his book, “The Logic of Scientific Discovery,” why science can’t prove anything. Give it a read and I trust it will make sense.
Yes, the proposed puzzle is unsolvable, but proving that climate is not a random walk is not. https://tamino.wordpress.com/2010/03/11/not-a-random-walk/
More “stats folks need to talk to signal processing folks” problems in this writeup. The stats folks are making assumptions about what the noise looks like at frequencies smaller than 1/135 (or, more accurately, about 4/135 due to Nyquist).
Since the data simply doesn’t exist for >135 years, they are extrapolating. That’s also called guessing, and in this case not even educated guessing. In fact it’s bad guessing: there are lots of papers showing multi-hundred- and thousand-year oscillations.
Also, the author ignores the possibility that the “bounded” temperature may have bounds far outside the last 135 years. In fact the proxy records seem to indicate as much.
Also I find the AR model to be just a lame way of doing interpolation in the frequency domain, when, in fact, you could just translate the original signal to the frequency domain and do fairly basic, statistically sound interpolation and smoothing (albeit on complex numbers), and then apply that result to white noise to shape it. Haven’t found a good paper on that, I’m a little bit in wild territory on this idea…
I’m going to look at the challenge data in the frequency domain. This should be fun, the author of the above challenge is a statistician and probably doesn’t talk to signal processing folks. I might catch something and make some cash. You never know till you try…
Peter
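For what it’s worth, the frequency-domain idea Peter describes (keep the measured spectrum, randomize everything else) has a standard small-scale sketch: phase-randomized surrogates. This is pure illustration in stdlib Python with a naive O(n²) DFT, not anyone’s contest entry:

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (fine for short series)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT, returning the real part (input has conjugate symmetry)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def phase_randomized_surrogate(x, rng):
    """Same amplitude spectrum as x, random phases: a matched-spectrum null series."""
    n = len(x)
    X = dft(x)
    Y = [0j] * n
    Y[0] = X[0]                                   # keep the mean (DC bin)
    for k in range(1, (n + 1) // 2):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        Y[k] = abs(X[k]) * cmath.exp(1j * phi)    # magnitude kept, phase scrambled
        Y[n - k] = Y[k].conjugate()               # symmetry => real-valued output
    if n % 2 == 0:
        Y[n // 2] = abs(X[n // 2])                # Nyquist bin must stay real
    return idft(Y)
```

By Parseval’s theorem the surrogate has exactly the same total power as the original, so batches of surrogates make a fair null distribution for any statistic computed on the real series.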
How sad for my dreams of riches. My attempt to recreate Doug’s process showed a clear difference in a complex plot of an FFT, but Doug Keenan’s did not 🙁 :-(. I got quite excited for a few hours.
My pink-noise generator is not generating the same phase relationships as Doug’s… Doug’s are highly regular and uniform and look exactly like those of the ramp we are looking for…
Back to the drawing board.
Peter
the prize is $100 000. In essence, the prize will be awarded to anyone who can demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.
Demonstrations, like proofs, depend on assumptions: make the appropriate assumptions, and the results follow. Anybody can experiment a lot, and then submit the demonstration, out of the many, that satisfies the requirement. He must have some restrictions on what assumptions are acceptable. Otherwise lots of people will win the prize. Note the wording: the increase in global temperatures is probably not due to random natural variation; a conclusion like that depends on the prior probability that the increase is not due to chance. What prior distributions on that outcome are permitted?
While the series were generated randomly, were they filtered in a non-random fashion? For example, based on their trend? This would effectively stack the deck if 3 different filters were used to match the 3 different sets of results.
The challenge is to use their statistical prowess to separate out the inherent noise of the climate system from the “anthropogenic signature” that they claim has contributed less than 1 degree of warming over the last 100+ years.
The details of the challenge are described at the website. He used a global temperature model to generate 1000 random time series with no trend over 100+ years, then added a trend to a subset of them that averages out to >+1 degree or <-1 degree over the length of the time series.
If they can correctly identify at least 900 of those time series as having a trend introduced or not, they win.
KTM, how would the contest judges know whether the submitted entry had not in fact been selected out of many tries?
You have not plotted the data, have you… Also, AR models DO produce trends. That’s the whole point of the exercise. There are naturally occurring trends. How do you know whether we have a natural trend or an aCO2-caused trend? A CAGWer can get $100k if they know how to do this. If they don’t, they have no business blaming aCO2…
I’ll also note in passing this also applies to ANYONE correlating to temperature. For example the Solar folks… most people have a real hard time with WE DON’T KNOW.
Peter
I can’t find out how to enter the contest. There is no email address or contact information.
How do we enter? I developed an approach that I like.
Here’s how! #(:))
HOW TO ENTER THIS CONTEST:
Hurrah! #(:))
(I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!!)
http://www.informath.org/AR5stat.pdf
Wow, this paper is basically a summation of what I spent the last 6 months futzing around with and have posted brief glimpses of. Thanks for publishing this article, Anthony; good stuff. I might be inspired to finish the work I describe above on what a confidence interval for a trend of ~1/f noise should look like.
Unfortunately the paper is mostly plain English rather than detailed technical exposition, so replicating it will be hard. The author doesn’t propose a correct model; he only points out that the one in IPCC AR4/AR5 is wrong.
Peter
While Keenan does not state his preferred model in the paper (although he alludes to it on page 5), he has presented it in his testimony before the UK Parliament in Lord Donoughue’s inquiry into the Met Office’s model and its ability to show statistical significance in the temperature series.
Keenan’s preferred model in this case is a driftless ARIMA(3,1,0). The Met Office and the IPCC use an ARIMA(1,0,0) model, which has been shown to be inadequate as it lacks an integrative term and is too narrowly scoped in its autoregressive term.
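For readers who want to see what these two model families do, here is a sketch that simulates each (the coefficients are illustrative placeholders, not fitted values from Keenan or the Met Office):

```python
import random

def simulate_ar1(n, phi, sigma, rng):
    """AR(1): x_t = phi * x_{t-1} + e_t (the ARIMA(1,0,0) family)."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, sigma))
    return x

def simulate_arima310(n, phis, sigma, rng):
    """Driftless ARIMA(3,1,0): the *first differences* follow an AR(3) process."""
    d = [0.0, 0.0, 0.0]                      # warm-up values for the differences
    for _ in range(n):
        e = rng.gauss(0, sigma)
        d.append(phis[0] * d[-1] + phis[1] * d[-2] + phis[2] * d[-3] + e)
    x = [0.0]
    for v in d[3:]:
        x.append(x[-1] + v)                  # integrate the differences
    return x[1:]

ar1 = simulate_ar1(135, 0.9, 0.1, random.Random(42))
arima = simulate_arima310(135, (0.4, -0.2, 0.1), 0.1, random.Random(7))
# An integrated series wanders much further from its start than a
# stationary AR(1) with the same innovation scale.
```

The integrated model produces long excursions that look like trends even with no drift term, which is the crux of Keenan's argument against trend-significance claims based on the simpler model.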
I do not prefer or advocate using any particular statistical model, nor have I ever done so.
Doug, my apology for poor wording. I recalled that you offered ARIMA(3,1,0) as a favored model against the Met Office’s trending model, as stated in your guest post at Bishop Hill:
That is the case I made reference to. I realize that what model is chosen for a given time series is dependent upon testing with AICc, BIC, ACF, PACF lag correlations and more. Thus, no, you would not wholesale advocate using any particular model and for good reason. I would not intentionally make the mistake in saying you would.
Hank, I appreciate that your comment was an honest one. The quote you include says that a driftless ARIMA(3,1,0) model is far better than the IPCC model, and so we should reject the IPCC model (which was the conclusion that I was seeking). The quote does not say that the ARIMA(3,1,0) model is any good in an absolute sense. I really have no idea what model we should accept. For a long discussion of all this, see my critique of the IPCC statistical analysis (linked on the contest web page, in the section on HMG).
Information and entropy. The more entropy the less information. So yes randomness is lack of information. Any recent (post ’60s – maybe earlier) study of thermo would give the equations.
About submitting a contest entry, this can be done by sending me an e-mail. The contest web page has been revised to state that.
@egFinn That is an interesting idea, and might well be worth pursuing. I want to think about it some more. Kind thanks for suggesting it.
Any finite time series, any finite shape, can be produced by a random process. For this reason one cannot distinguish between a random and a non-random sequence of numbers. I think I do not understand the premise of the contest.
I finally get to reply to the other Peter…
In statistics of a time series and signal processing, when the underlying causes of a signal are unknowable, the null hypothesis against “do I have a correlatable or otherwise useful signal” is to check to see if the signal is significantly different from random noise of the same spectrum, because when you have underlying causes from large numbers of random variables, it’s in effect a random process.
The null hypothesis for “is there a significant signal in the global temperature record” is to test against a Monte Carlo simulation of a random process of equivalent spectrum in order to determine if it’s just due to variation in the variables you don’t know about. Some examples of “known unknowns” are solar variation, stored heat oscillations in the oceans, oscillations in ice coverage between the arctic and antarctic, PDO, etc, all stuff that’s been speculated about for years but isn’t measurable due to all the confounding other variables or lack of data. You can only make a valid conclusion about the entire group of unknown variables (aka the global temperature); the individual components aren’t distinguishable. Then of course there’s all the unknown unknowns…
Here’s a paper that uses this technique to find that indeed, the ENSO signal is significantly different from random noise. Also note there’s no other signal there…
http://journals.ametsoc.org/doi/pdf/10.1175/1520-0477%281998%29079%3C0061%3AAPGTWA%3E2.0.CO%3B2
You can follow the references in that paper if you like, to see who originated this general idea and proved its validity. Alas, I keep getting stuck behind paywalls, and I gave up trying to find the original paper. Such a tragedy that general human knowledge the taxpayers paid for is not available to the common man. I’m also sad the author of this contest didn’t cite the original paper in his letter to his government.
Compounding this problem is that the lower frequency the signal compared to your sample, the more likely it’s noise. You are also extrapolating, because you don’t know what the noise is at frequencies lower than the inverse of ~1/5 the length of your sample (Nyquist). There’s no lower frequency than the trend drawn through your data…. That’s why I laugh every time I see a trend line drawn through temperature data. Including Lord Monckton’s 18yr9mo “no trend” graphs (though he’s just hoisting the warmists by their own faulty petard).
Peter
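The null-hypothesis test Peter describes can be sketched as a Monte Carlo against fitted AR(1) surrogates. AR(1) here is only a convenient stand-in for “random noise of equivalent spectrum”; a real test would match the spectrum more carefully:

```python
import random

def trend(series):
    """OLS slope of a series against its time index."""
    n = len(series)
    xbar = (n - 1) / 2.0
    ybar = sum(series) / n
    sxx = sum((t - xbar) ** 2 for t in range(n))
    return sum((t - xbar) * (y - ybar) for t, y in enumerate(series)) / sxx

def ar1_coefficient(series):
    """Lag-1 autocorrelation: a crude plug-in estimate of phi."""
    m = sum(series) / len(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((a - m) ** 2 for a in series)
    return num / den

def monte_carlo_p(series, n_sims=2000, seed=0):
    """Fraction of AR(1) surrogates whose |trend| beats the observed |trend|."""
    rng = random.Random(seed)
    phi = ar1_coefficient(series)
    resid_sd = (sum((b - phi * a) ** 2
                    for a, b in zip(series, series[1:])) / (len(series) - 1)) ** 0.5
    observed = abs(trend(series))
    hits = 0
    for _ in range(n_sims):
        x = [series[0]]
        for _ in range(len(series) - 1):
            x.append(phi * x[-1] + rng.gauss(0, resid_sd))
        if abs(trend(x)) >= observed:
            hits += 1
    return hits / n_sims
```

A small returned fraction means the observed trend is rarely matched by same-spectrum noise; a large one means you cannot reject the “just noise” null, which is the outcome Keenan expects for the temperature record under an adequately persistent null model.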
Sorry, I forgot to relate this to what the challenger did.
The challenger threw down the gauntlet and said “let’s see if you can invalidate the null hypothesis that the 135 years of temperature is just noise”, at the 90% confidence level (you may misidentify up to 100 of the 1000 series). If you can answer his challenge, then you can also prove that the temperature record has a statistically significant trend. (Let’s not argue whether the trend is drylabbed at this point; it doesn’t matter for this particular statistical argument.)
Personally I accept the challenge not because I don’t agree with the challenger’s hypothesis, but rather he’s a statistician and I’m a signal processor, and I might be able to find a flaw in his generation of random signals ;-). I could use $100k to fund that around the world surf trip I always wanted to take.
Peter
And, one more time (well, Mr. Keenan — I didn’t read your 11/19, 0419 post until a few seconds ago and… I went to a lot of trouble, so… I’m getting my posting money’s worth, heh) — just for the convenience of WUWT commenters:
HOW TO ENTER THIS CONTEST:
Hurrah! #(:))
(I used his “contact me” info. on McKitrick’s website and e mailed him yesterday evening — and he answered me!!)
Addendum: See Douglas Keenan’s post at 0419 today — he put entry info. on the contest site.
In Excel, with years 1880, 1881 … etc. down column A and a seed number* in Cell B1, this formula:
=B1+(RAND()-0.5)*0.15
entered in Cell B2 copied on down and graphed out, generates a curve that often looks darn similar to GISS or any other world temperature timeline graph.
*The anomaly value from the base for the first year.
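The same spreadsheet recipe as a sketch in Python, for anyone without Excel handy (the step size 0.15 comes from the comment’s formula; the start value and seed below are assumed for illustration):

```python
import random

def random_walk(start, n_years, step=0.15, seed=None):
    """Mimic the spreadsheet formula =B1+(RAND()-0.5)*0.15 copied down a column."""
    rng = random.Random(seed)
    series = [start]
    for _ in range(n_years - 1):
        series.append(series[-1] + (rng.random() - 0.5) * step)
    return series

# 135 "years" of trendless wandering, starting from an assumed anomaly of -0.2:
anomalies = random_walk(-0.2, 135, seed=3)
```

Plot a few of these and many will indeed look “darn similar” to a global temperature timeline, which is the commenter’s point.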
@Keenan
I wonder if you have a way to find the trended/non trended series without knowledge of the process.
And would it be possible even with knowledge of the process?
I like the idea of the contest, it is funny and interesting for the coupling to the real world.
Dear Mr. Ferdinandsen,
Try e-mailing Douglas Keenan here: doug.keenan@informath.org.
Best wishes for a successful contest entry! (oh, boy, do I admire all of you who can even make a reasonable attempt to do that! — I’d have to take down about 5 books and study for months….)
Janice
I have occasionally, over the past several years, wondered about something similar to what Svend Ferdinandsen seems (to me) to be suggesting. Sorry in advance, Svend Ferdinandsen, if I misinterpreted your comment.
Can one determine whether some sets of numbers are time series at all if one is only given the sets of numbers, without any reference to what or how the numbers are derived or measured from, or whether they are time series? I guess that depends on a basic question: do time series have features that uniquely and unambiguously distinguish them as time series without knowing they are time series?
The question interests me in a ‘dectective’ kind of perspective.
John
Oops, in immediately above comment ‘John Whitman November 19, 2015 at 12:40 pm’ I meant ‘detective’ in the last line, not ‘dectective’
John
@Merovign: Great shot! I get it.
And according to the Essex et al. paper, average temperatures (being numbers, they can always be averaged) can have no physical meaning.
If I were to give a number to a thousand places, such as 2.03048154248…, which never repeated itself and gave no clue what the next digit would be, would that be a random number? It would appear that way. However, if I then told you that this number is merely pi with every digit decreased by 1, would it still be a random number?
(pi = 3.14159265359…)
Doug (Keenan), first, thanks for an interesting challenge. I agree with you that the generally used statistics are wildly deceptive when applied to observational data such as global temperature averages. However, it is not clear what you mean when you say that the datasets were generated via “trendless statistical models”.
From an examination of the data, it appears that you are using “trendless” to mean a statistical model which generates data which may contain what might be called a trend, but the trends average to zero. Your data is strongly autocorrelated (as is much observational data). Random “red noise” data of this type contains much larger trends, in both length and amplitude, than does “white noise” data.
Using a single-depth ARMA model, your average AR is 0.93, which is quite high. Data generated using high AR values naturally wanders in long excursions that can look very much like trends.
But then, without a “bright-line” definition of whatever it is that you are calling a trend, it’s hard to tell.
Next, “trend” in general is taken as meaning the slope of the linear regression line of the data. Viewed from this perspective, it is clear that ANY dataset can be decomposed into a linear trend component and a “detrended” residual component.
Since from this perspective all datasets contain a trend, it’s obvious that your challenge (determine which of these contain an added trend) is very possibly not solvable as it stands. Fascinating nonetheless.
You may have given your definition of a “trend” and a “trendless statistical model” elsewhere, in which case a link (and a page number if necessary) would be greatly appreciated.
A final question—were the trends added to random datasets, or to chosen datasets?
Thanks for the fun,
w.
PS—using only 135 data points and asking for 90% accuracy? Most datasets are monthly, so they are on the order of 1500 data points. Next, we don’t need 90% accuracy. We just need to be a measurable amount better than random, whatever that might be. A more interesting test would be 1000 datasets from the SAME “trendless statistical model” with a length of 1500 data points or so, with half of the thousand having a trend added. That would let us compare different methods of investigating the problem.
Willis,
Taking your points in turn….
A trendless statistical model is a statistical model that does not incorporate a trend.
AR coefficients should only be calculated after detrending, otherwise they will tend to be too high. Calculating the AR(1) coefficient using your method on HadCRUT spanning 1880–2014 (135 years) gives 0.92, which is essentially the same as what you got for the simulated data.
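Keenan’s detrend-first point can be sketched directly (a hedged illustration, not his code; the trended test series below is synthetic):

```python
import random

def lag1_autocorr(series):
    """Plain lag-1 autocorrelation of a series."""
    m = sum(series) / len(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((a - m) ** 2 for a in series)
    return num / den

def detrended_ar1(series):
    """Fit an OLS line, then take the lag-1 autocorrelation of the residuals."""
    n = len(series)
    xbar = (n - 1) / 2.0
    ybar = sum(series) / n
    sxx = sum((t - xbar) ** 2 for t in range(n))
    slope = sum((t - xbar) * (y - ybar) for t, y in enumerate(series)) / sxx
    intercept = ybar - slope * xbar
    resid = [y - (intercept + slope * t) for t, y in enumerate(series)]
    return lag1_autocorr(resid)

# A synthetic trended series: the raw lag-1 coefficient is inflated by the
# trend, while the detrended value reflects only the noise.
rng = random.Random(5)
trended = [0.05 * t + rng.gauss(0, 0.1) for t in range(135)]
print(lag1_autocorr(trended))   # close to 1, driven by the trend
print(detrended_ar1(trended))   # much smaller: the residuals are white
```

This is why a raw AR coefficient of ~0.9 on 135 years of data, by itself, cannot distinguish a trending series from a persistent trendless one.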
There is a standard definition of “trend” in time series analysis. In particular, trends can be stochastic, as well as deterministic. The standard reference is Time Series Analysis by Hamilton (1994); for trends, see chapter 15.
Regarding your claim that “all datasets contain a trend”, that is not true. Rather, no statistical data set contains a trend. Instead, statistical models of a data set may or may not contain a trend.
About your final question, the contest web page should have said that the series were randomly-chosen. I have revised the page. Kind thanks for pointing this out.
Douglas J. Keenan November 19, 2015 at 3:21 pm
Can a trendless statistical model produce trended data? Depends on your definition of “trend”.
OK.
I’m afraid that doesn’t help, as I don’t have the text you refer to. Perhaps you could define for us, in an unambiguous way, exactly what you mean by a trend, because in this case, I’m more interested in what YOU call a trend. This is particularly true since below you say that no dataset has a trend … if so, then what are you calling a trend? For example, you say:
But you also say:
If that is the case then how can you possibly add a trend to a series, as in your quote above?
Thanks, I’d assumed that, just wanted to check.
I don’t think I can identify the trended data with 90% accuracy. However, to my surprise it is possible to distinguish between trended and untrended data at least part of the time. You giving any prizes for say 60% accuracy?
Many thanks,
w.
And an additional complication that statisticians IMHO are ignoring: There are trends from frequencies smaller than 1/data_length. They are there in the real world, they show up as a trend in any subsample of a data set, but AR models won’t generate them because the AR models generate no signal at frequencies below 1/data_length (it’s actually a rolloff curve across frequencies starting at some constant k/data_length).
Examples of some trends that AR models won’t generate on a 135-year history of temperature: the alleged 1000-year cycle, which can be seen in some proxy records but whose magnitude we don’t know accurately. We are pretty sure it’s there, though. There are also cycles between 100 and 1000 years that show up in proxy records. They’ll show up as trends, but if you build an AR model from 135 years you aren’t generating them.
The slope of the trend you are generating has a distribution you aren’t directly controlling for. IMHO you are guessing. Try running a Monte Carlo simulation and generate thousands of AR models and generate a histogram of the trends. Then tell me why that histogram is an accurate representation of real world trends. Given the fact that you aren’t generating trends that are greater than 1/data_length you are likely underestimating the width of that distribution.
You can also test this by generating 8x length of the original data length with your AR model and then taking the datalength from the middle of that set of data. You’ll find that magic number of 8x maximizes the width of trend distribution as compared to 4 or 16.
Peter
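Peter’s Monte Carlo suggestion is easy to sketch (illustrative parameters only; this is not the contest’s generating model):

```python
import random

def ols_slope(series):
    """OLS slope of a series against its time index."""
    n = len(series)
    xbar = (n - 1) / 2.0
    ybar = sum(series) / n
    sxx = sum((t - xbar) ** 2 for t in range(n))
    return sum((t - xbar) * (y - ybar) for t, y in enumerate(series)) / sxx

def ar1_trend_distribution(phi, sigma, length, n_sims, seed=0):
    """OLS trends from many trendless AR(1) realizations (histogram fodder)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sims):
        x = [0.0]
        for _ in range(length - 1):
            x.append(phi * x[-1] + rng.gauss(0, sigma))
        slopes.append(ols_slope(x))
    return slopes

# Purely "trendless" AR(1) noise still shows trends, and the spread of
# those trends grows sharply with the autocorrelation phi:
narrow = ar1_trend_distribution(0.2, 0.1, 135, 500)
wide = ar1_trend_distribution(0.95, 0.1, 135, 500, seed=1)
```

Histogram the two slope samples and the high-phi distribution is visibly wider; whether either width matches real-world trend behavior is exactly the open question Peter raises.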
All data sets contain a trend, Willis, because trend has nothing to do with the data set. It is something that is defined by a statistical mathematics algorithm.
So it works on any data set, regardless of the nature of the numbers in that data set.
You could pick up a Sunday newspaper, start from the front top line, read through, and simply collect every number found anywhere, and you have your data set.
Apply the stat algorithm, and it will give you the trend for that data set.
G
It would be interesting to see which of the very dogmatic, often abusive AGW aficionados DOESN’T have enough confidence in their understanding of the maths and physics involved to stump up the $10 and enter this contest…
Any suggestions?