Guest essay By Kip Hansen
INTRO: Statistical trends never determine future values in a data set. Trends do not and cannot predict future values. If these two statements make you yawn and say “Why would anyone even have to say that? It is self-evident.” then this essay is not for you, you may go do something useful for the next few minutes while others read this. If you had any other reaction, read on. For background, you might want to read this at Andrew Revkin’s NY Times Dot Earth blog.
I have an acquaintance that is a fanatical button collector. He collects buttons at every chance, stores them away, thinks about them every day, reads about buttons and button collecting, spends hours every day sorting his buttons into different little boxes and bins and worries about safeguarding his buttons. Let’s call him simply The Button Collector or BC, for short.
Of course, he doesn’t really collect buttons, he collects dollars, yen, lira, British pounds sterling, escudos, pesos…you get the idea. But he never puts them to any useful purpose, neither really helping himself or helping others, so they might as well just be buttons, so I call him: The Button Collector. BC has millions and millions of buttons – plus 102. For our ease today, we’ll consistently leave off the millions and millions and we’ll say he has just the 102.
On Monday night, at 6 PM, BC counts his buttons and finds he has 102 whole buttons (we will have no half buttons here please); Tuesday night, he counts again: 104 buttons; on Wednesday night, 106. With this information, we can do wonderful statistical-ish things. We can find the average number of buttons over three days (both mean and median). Precisely 104.
We can determine the statistical trend represented by this three-day data set. It is precisely +2 buttons/day. We have no doubts, no error bars, no probabilities (we have 100% certainty for each answer).
How many buttons will there be Friday night, two days later?
If you have answered with any number or a range of numbers, or even let a number pass through your mind, you are absolutely wrong.
The only correct answer is: We have no idea how many buttons he will have Friday night because we cannot see into the future.
But, you might argue, the trend is precisely, perfectly, scientifically statistically +2 buttons/day and two days pass, therefore there will be 110 buttons. All but the final phrase is correct, the last — “therefore there will be 110 buttons” — is wrong.
We know only the numbers of buttons counted each of the three days – the actual measurements of number of buttons. Our little three point trend is just a graphic report about some measurements. We know also, importantly, the model for the taking the measurements – exactly how we measured — a simple count of whole buttons, as in 1, 2, 3, etc..
We know how the data was arrived at (counted), but we don’t know the process by which buttons appear in or disappear from BC’s collection.
If we want to be able to have any reliable idea about future button counts, we must have a correct and complete model of this particular process of button collecting. It is really little use to us to have a generalized model of button collecting processes because we want a specific prediction about this particular process.
Investigating, by our own observation and close interrogation of BC, we find that my eccentric acquaintance has the following apparent button collecting rules:
- He collects only whole buttons – no fractional buttons.
- Odd numbers seem to give him the heebie-jeebies, he only adds or subtracts even numbers of buttons so that he always has an even number in the collection.
- He never changes the total by more than 10 buttons per day.
These are all fictional rules for our example; of course, the actual details could have been anything. We then work these into a tentative model representing the details of this process.
So now that we have a model of the process; how many buttons will there be when counted on Friday, two days from now?
Our new model still predicts 110, based on trend, but the actual number on Friday was 118.
The truth being: we still didn’t know and couldn’t have known.
What we could know on Wednesday about the value on Friday:
- We could know the maximum number of buttons – 106 plus ten twice = 126
- We could know the minimum – 106 minus ten twice = 86
- We could know all the other possible numbers (all even, all between 86 and 126 somewhere). I won’t bother here, but you can see it is 106+0+0, 106+0+2, 106+0+4, etc..
- We could know the probability of the answers, some answers being the result of more than one set of choices. (such as 106+0+2 and 106+2+0)
- We could then go on to figure five day trends, means and medians for each of the possible answers, to a high degree of precision. (We would be hampered by the non-existence of fractional-buttons and the actual set only allowing even numbers, but the trends, means and medians would be statistically precisely correct.)
What we couldn’t know:
- How many buttons there would actually be on Friday.
Why couldn’t we know this? We couldn’t know because our model – our button collecting model – contains no information whatever about causes. We have modeled the changes, the effects, and some of the rules we could discover. We don’t know why and under what circumstances and motivations the Button Collector adds or subtracts buttons – we don’t really understand the process – BC’s button collecting – because we have no data about the causes of the effects we can observe or the rules we can deduce.
And, because we know nothing about causes in our process, our model of the process, being magnificently incomplete, can make no useful predictions whatever from existing measurements.
If we were able to discover the causes effective in the process, and their relative strengths, relationships and conditions, we could improve our model of the process.
Back we go to The Button Collector and under a little stronger persuasion he reveals that he has a secret formula for determining whether or not to add or subtract the numbers of buttons previously observed and a formula for determining this. Armed with this secret formula, which is precise and immutable, we can now adjust our model of this button collecting process.
Testing our new, improved, and finally adjusted model, we run it again, pretending it is Wednesday, and see if it predicts Friday’s value. BINGO! ONLY NOW does it give us an accurate prediction of 118 (the already known actual value) – a perfect prediction of a simple, basic, wholly deterministic (if tricky and secret) process by which my eccentric acquaintance adds and subtracts buttons from his collection.
What can and must we learn from this exercise?
1. No statistical trend, no matter how precisely calculated, regardless of its apparent precision or length, has any effect whatever on future values of a data set – never, never and never. Statistical trends, like the data of which they are created, are effects. They are not causes.
2. Models, not trends, can predict, project, or inform about possible futures, to some sort of accuracy. Models must include all of the causative agents involved which must be modeled correctly for relative effects. It takes a complete, correct and accurate model of a process to reliably predict real world outcomes of that process. Models can and should be tested by their abilities to correctly predict already known values within a data set of the process and then tested again against a real world future. Models also are not themselves causes.
3. Future values of a thing represented by a metric in data set output from a model are caused only by the underlying process being modeled–only the actual process itself is a causative agent and only the actual process determines future real world results.
PS: If you think that this was a silly exercise that didn’t need to be done, you haven’t read the comments section at my essay at Dot Earth. It never hurts to take a quick pass over the basics once in a while.
# # # # #