
Guest essay By Kip Hansen
INTRO: Statistical trends never determine future values in a data set. Trends do not and cannot predict future values. If these two statements make you yawn and say “Why would anyone even have to say that? It is self-evident.” then this essay is not for you; you may go do something useful for the next few minutes while others read this. If you had any other reaction, read on. For background, you might want to read this at Andrew Revkin’s NY Times Dot Earth blog.
I have an acquaintance who is a fanatical button collector. He collects buttons at every chance, stores them away, thinks about them every day, reads about buttons and button collecting, spends hours every day sorting his buttons into different little boxes and bins and worries about safeguarding his buttons. Let’s call him simply The Button Collector or BC, for short.
Of course, he doesn’t really collect buttons, he collects dollars, yen, lira, British pounds sterling, escudos, pesos…you get the idea. But he never puts them to any useful purpose, neither really helping himself nor helping others, so they might as well just be buttons, so I call him: The Button Collector. BC has millions and millions of buttons – plus 102. For our ease today, we’ll consistently leave off the millions and millions and we’ll say he has just the 102.
On Monday night, at 6 PM, BC counts his buttons and finds he has 102 whole buttons (we will have no half buttons here please); Tuesday night, he counts again: 104 buttons; on Wednesday night, 106. With this information, we can do wonderful statistical-ish things. We can find the average number of buttons over three days (both mean and median). Precisely 104.
We can determine the statistical trend represented by this three-day data set. It is precisely +2 buttons/day. We have no doubts, no error bars, no probabilities (we have 100% certainty for each answer).
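For readers who like to check the arithmetic, here is a minimal sketch of those three-day statistics. The day indices 0, 1, 2 and the helper name `least_squares_slope` are illustrative choices, not anything from the button count itself; the trend is taken to be the ordinary least-squares slope.

```python
from statistics import mean, median

# Button counts from the essay: Monday, Tuesday, Wednesday.
counts = [102, 104, 106]
days = [0, 1, 2]  # day index (an illustrative choice)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


print(mean(counts))                       # 104
print(median(counts))                     # 104
print(least_squares_slope(days, counts))  # 2.0 buttons/day
```

Everything this computes is a description of the three counts already in hand.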
How many buttons will there be Friday night, two days later?
If you have answered with any number or a range of numbers, or even let a number pass through your mind, you are absolutely wrong.
The only correct answer is: We have no idea how many buttons he will have Friday night because we cannot see into the future.
But, you might argue, the trend is precisely, perfectly, scientifically statistically +2 buttons/day and two days pass, therefore there will be 110 buttons. All but the final phrase is correct, the last — “therefore there will be 110 buttons” — is wrong.
We know only the numbers of buttons counted each of the three days – the actual measurements of number of buttons. Our little three point trend is just a graphic report about some measurements. We know also, importantly, the model for taking the measurements, exactly how we measured: a simple count of whole buttons, as in 1, 2, 3, etc.
We know how the data was arrived at (counted), but we don’t know the process by which buttons appear in or disappear from BC’s collection.
If we want to be able to have any reliable idea about future button counts, we must have a correct and complete model of this particular process of button collecting. It is really little use to us to have a generalized model of button collecting processes because we want a specific prediction about this particular process.
Investigating, by our own observation and close interrogation of BC, we find that my eccentric acquaintance has the following apparent button collecting rules:
- He collects only whole buttons – no fractional buttons.
- Odd numbers seem to give him the heebie-jeebies, he only adds or subtracts even numbers of buttons so that he always has an even number in the collection.
- He never changes the total by more than 10 buttons per day.
These are all fictional rules for our example; of course, the actual details could have been anything. We then work these into a tentative model representing the details of this process.
So now that we have a model of the process, how many buttons will there be when counted on Friday, two days from now?
Our new model still predicts 110, based on trend, but the actual number on Friday was 118.
The truth being: we still didn’t know and couldn’t have known.
What we could know on Wednesday about the value on Friday:
- We could know the maximum number of buttons – 106 plus ten twice = 126
- We could know the minimum – 106 minus ten twice = 86
- We could know all the other possible numbers (all even, all between 86 and 126 somewhere). I won’t bother here, but you can see it is 106+0+0, 106+0+2, 106+0+4, etc.
- We could know the probability of the answers, some answers being the result of more than one set of choices. (such as 106+0+2 and 106+2+0)
- We could then go on to figure five day trends, means and medians for each of the possible answers, to a high degree of precision. (We would be hampered by the non-existence of fractional-buttons and the actual set only allowing even numbers, but the trends, means and medians would be statistically precisely correct.)
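The list above can be sketched in a few lines. This is only an illustration under a loud assumption: that every allowed pair of daily changes (Thursday, Friday) is equally likely. The rules say nothing of the kind, so the shares printed here are path counts, not real probabilities.

```python
from collections import Counter
from itertools import product

start = 106                        # Wednesday's count, from the essay
daily_changes = range(-10, 11, 2)  # even changes from -10 to +10, per the stated rules

# Count how many (Thursday change, Friday change) pairs lead to each Friday total.
# ASSUMPTION: every pair is treated as equally likely; the essay does not say so.
ways = Counter(start + thu + fri for thu, fri in product(daily_changes, repeat=2))
total_paths = sum(ways.values())

print(min(ways), max(ways))     # 86 126
print(len(ways))                # 21 distinct possible totals
print(ways[110], total_paths)   # 9 of 121 paths land on the trend value
print(ways[118], total_paths)   # 5 of 121 paths land on the actual value
```

Note that the enumeration tells us what Friday could be, and nothing about what Friday will be.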
What we couldn’t know:
- How many buttons there would actually be on Friday.
Why couldn’t we know this? We couldn’t know because our model – our button collecting model – contains no information whatever about causes. We have modeled the changes, the effects, and some of the rules we could discover. We don’t know why and under what circumstances and motivations the Button Collector adds or subtracts buttons – we don’t really understand the process – BC’s button collecting — because we have no data about the causes of the effects we can observe or the rules we can deduce.
And, because we know nothing about causes in our process, our model of the process, being magnificently incomplete, can make no useful predictions whatever from existing measurements.
If we were able to discover the causes effective in the process, and their relative strengths, relationships and conditions, we could improve our model of the process.
Back we go to The Button Collector and under a little stronger persuasion he reveals that he has a secret formula for determining whether to add or subtract buttons, and how many, based on the numbers of buttons previously observed. Armed with this secret formula, which is precise and immutable, we can now adjust our model of this button collecting process.
Testing our new, improved, and finally adjusted model, we run it again, pretending it is Wednesday, and see if it predicts Friday’s value. BINGO! ONLY NOW does it give us an accurate prediction of 118 (the already known actual value) – a perfect prediction of a simple, basic, wholly deterministic (if tricky and secret) process by which my eccentric acquaintance adds and subtracts buttons from his collection.
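The test just described, running a model as if it were Wednesday and checking it against Friday's known value, might look like the sketch below. BC's secret formula is never stated, so `hypothetical_rule_model` is a purely invented stand-in, chosen only because it happens to reproduce the four known counts; the function names are likewise illustrative.

```python
observed = [102, 104, 106]  # Monday to Wednesday, from the essay
actual_friday = 118         # Friday's count, from the essay


def trend_model(history, days_ahead):
    """Extend the average daily change seen so far (here +2 buttons/day)."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope * days_ahead


def hypothetical_rule_model(history, days_ahead):
    """INVENTED stand-in for the secret formula (not given in the essay):
    each day's change equals the sum of all earlier daily changes."""
    counts = list(history)
    for _ in range(days_ahead):
        # Sum of all earlier changes = latest count minus first count.
        counts.append(counts[-1] + (counts[-1] - counts[0]))
    return counts[-1]


for model in (trend_model, hypothetical_rule_model):
    prediction = model(observed, days_ahead=2)
    print(model.__name__, prediction, prediction == actual_friday)
```

Under these assumptions the trend gives 110 and misses, while the rule-based stand-in gives 118 and matches. Matching one known value is a test passed, not proof that the stand-in is the real formula.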
What can and must we learn from this exercise?
1. No statistical trend, no matter how precisely calculated, regardless of its apparent precision or length, has any effect whatever on future values of a data set – never, never and never. Statistical trends, like the data of which they are created, are effects. They are not causes.
2. Models, not trends, can predict, project, or inform about possible futures, to some sort of accuracy. Models must include all of the causative agents involved which must be modeled correctly for relative effects. It takes a complete, correct and accurate model of a process to reliably predict real world outcomes of that process. Models can and should be tested by their abilities to correctly predict already known values within a data set of the process and then tested again against a real world future. Models also are not themselves causes.
3. Future values of a thing represented by a metric in data set output from a model are caused only by the underlying process being modeled–only the actual process itself is a causative agent and only the actual process determines future real world results.
PS: If you think that this was a silly exercise that didn’t need to be done, you haven’t read the comments section at my essay at Dot Earth. It never hurts to take a quick pass over the basics once in a while.
# # # # #
“The only correct answer is: We have no idea how many buttons he will have Friday night because we cannot see into the future.”
— I would say that states things a little more harshly than is justified and that a more correct answer is “We cannot be sure, but the trend indicates a best guess of…” I forecast using trends all the time. A trend is useful even if it is only partially informative. No, we don’t know the underlying equation and we might be wrong but in many cases it doesn’t matter.
What’s missing here is the intended use of the forecast. If your boss is asking you to forecast sales, you’d better not tell him that there’s no way to know for sure. Before firing you he’ll say he knows that, but the business needs to make decisions on inventory and capacity. It doesn’t matter if you’re a bit wrong because you’ll simply adjust the forecast. But if your boss is deciding whether to decarbonize the global economy at a cost of several trillion dollars, then the strength of your model had darn well better justify it. Otherwise, the answer is going to be to keep studying and keep him posted on your progress, but hold off on the big spending.
I think that makes it very clear. It was also a fun read. Thank you.
Nicely put, thanks. Yes, too often people are lulled into thinking the pretty graph has meaning beyond being a useful display device.
Cause is everything, and when it comes to “global warming” there are nowhere near enough causal factors that can be relied upon to produce proof – it is mostly pure speculation. The mind boggles at the use of these graphs to bring in monstrous taxes and trading schemes for minuscule human-introduced increases of one of the world’s most important elements for the survival of life on this planet.
He will have between 0 and a googolplex of buttons. I’m 95% certain of it.
if the underlying process is regular and invariant, then trend is, indeed, of predictive value with great accuracy.
for example, i need not specify planetary orbital paths nor gravitational constants to tell you that between now and a year from now there will be about 365 cycles of day/night.
but this in no way validates ‘statistics’ as anything remotely scientific. it is a tool of numerologists and no less mystical than tea leaf reading.
nevertheless, i think it may be worth noting that statistics, like the broken watch, has its moments on occasion – if only to make sure this type of occurrence is properly classified as special and exceptional.
author is getting on a high horse over words. I’ve used trends to predict the future b4. Distinguishing between models and statistical trends is a grey area.
jim2 says:
October 17, 2013 at 7:26 pm
He will have between 0 and a googolplex of buttons. I’m 95% certain of it.
I’m 100% sure he’ll have between 0 and infinity (inclusive).
Based on current trends, how many months until WUWT reaches 2 billion views? I guess this cannot be safely predicted as unforeseen events may change current viewing habits.
I am astounded, as a dedicated visitor since well before Climategate, at how rapidly the number of views is mounting – around a million a week. Astounded, but not surprised. Congratulations, WUWT!
On the whole you’re right. However, you don’t need to know every cause. In fact, you can never know when you possess a complete set. You only need to know the more common and major ones. If this weren’t true, then statistical mechanics would have zero value.
There are systems for which there are deterministic, physical causes of trends that can be characterized by parameters whose values are determined by fitting noisy data trends. Future average performance is projected by extrapolating the trend via its parameters. IPCC climate modelers would have us believe that future climate really does depend deterministically on CO2 levels. The obvious failure of their models to predict the current arrest of global temperature anomalies says that they have failed to account for some important physical variables.
It’s funny and sad that we have to remind scientists (and laymen) how to do statistics using pre-school techniques. Well done though; if the article were shorter, perhaps science journalists would read it too (and not just the abstract).
Two weeks on, the number of buttons was found to have been constant for 17 days. I understand that this matter was then considered by the IPCC (Intergovernmental Panel on Clasps and Closures) and they reached a consensus opinion that, as 10 of the highest observations of number of buttons were made in the previous fortnight, the collection of buttons was clearly continuing to grow.
Sorry but the author is incorrect. Savings==investment. If the saver has chosen not to benefit then the benefit goes to somebody else. A few years ago wealth was transferred from some savers, whose savings were in US dollars, to savers whose savings were in bank shares. About a trillion dollars was stolen from one class of saver and handed to another. Saving always benefits SOMEBODY.
It is not possible to “neither help oneself nor others” by saving. Saving is always put to useful purpose.
The foregoing assumes that the savings are “real”, i.e. more was produced than consumed. Fake savings can be produced by favoured classes by permitting them to write up the value of imaginary assets (that’s what the banker thieves did. Oh, and are still doing, by the way (QE x)).
So, just because there has been a pause in rise of global mean surface temperature does not mean we can reject the physics of global warming? Rats.
The trend just provides a first approximation for a model. The Pearson statistic may help you decide how much you want to bet on a linear extrapolation. It doesn’t take into consideration the consequences of being wrong. I’m betting I’ll wake up tomorrow, and that I’ll keep waking up each day for some time. Happily, if I’m wrong, I’ll never know. However, there would be consequences for others I’d rather they didn’t have. There is quite a bit of money on the line for insurance, and underwriting is a serious business. Back to trends again. Dealing with relatively large numbers of people helps. Do we have enough people for really good statistics (see DAV above)? We still don’t have our Hari Seldon, nor psychohistory.
Climate variables are not Martingale variables.
I believe that Nassim Taleb refers to this as the “fallacy of the turkey” in both The Black Swan and Antifragile. Then again, what does he know…being an independently wealthy Wall Street “pit trader”, who AFTER achieving that success, went back to graduate school and earned his academic credentials. Let’s hope he’s no turkey!
Trends which are stable despite a variety of potential disruptive events over the observed period will induce some confidence that it will continue. Trends which wobble in response to interventions may give some indication of causality. Climate science has neither.
Well done!
Yes, as others have pointed out, we need to make guesses about the future and sometimes those guesses turn out to be accurate (the IPCC is merely precise, lol), but, in the end, all we have based on statistics which only describe the past or present is:
a guess.
It was interesting and informative to read the comments at Dot Earth Kip Hansen. Even relatively intelligent posters have difficulty understanding this simple concept that trend of the past is not necessarily predictive of the future especially at longer time scales. It is clear that many do not understand the complexity of the solar, ocean, atmosphere coupled dynamics. It was especially interesting that your link to Marcia Wyatt’s new paper was ignored. Much has been ignored regarding the oceans while focusing on the atmosphere over the last twenty years.
Kip Hansen: “The only correct answer is: We have no idea how many buttons he will have Friday night because we cannot see into the future.”
The correct answer is that you’ve made a mash of the whole topic. It’s certainly true that you cannot guarantee now the then that has yet to happen. But statistics are not and have never been a guarantee.
Nor is there any necessity, or even sanity, in attempting self-fulfilling logical positivist approaches to causality. Newton went commando with Hypotheses non fingo. Kepler was a pure data fiddler. So was Galileo. Point of fact, any postulated cause has less expectation for correctness, in general, than a long chain of validated predictions arising from a non-causal model. And that’s putting aside any expected issues that arise from the field of physics and its distinct reliance on impossible objects taken to every infinite asymptote they can find.
And yet the things still work out. The problem here isn’t statistics, nor trends generally, but one of repetition and validation of a long chain of predictions. Especially when you have non-replicable observables and need to see your trend go both up and down. The problem isn’t the math. And it isn’t the models per se.
It’s a complete disregard for validating the results before launching into religious pronouncements that the world is doomed, that CO2 *is* the climate, or that there must be something terribly wrong with math.
“Dewey Defeats Truman”
@Aynsley Kellow (8:17pm): Nicely put. Precisely!
OMG buttons cause global warming.
“Models also are not themselves causes.”
Except the effect they have on the human mind.