The Button Collector or When does trend predict future values?

How many buttons will he have on Friday? (Photo credit: Wikipedia)

Guest essay By Kip Hansen

INTRO: Statistical trends never determine future values in a data set. Trends do not and cannot predict future values. If these two statements make you yawn and say “Why would anyone even have to say that? It is self-evident.” then this essay is not for you; you may go do something useful for the next few minutes while others read this. If you had any other reaction, read on. For background, you might want to read this at Andrew Revkin’s NY Times Dot Earth blog.

I have an acquaintance who is a fanatical button collector. He collects buttons at every chance, stores them away, thinks about them every day, reads about buttons and button collecting, spends hours every day sorting his buttons into different little boxes and bins and worries about safeguarding his buttons. Let’s call him simply The Button Collector or BC, for short.

Of course, he doesn’t really collect buttons, he collects dollars, yen, lira, British pounds sterling, escudos, pesos…you get the idea. But he never puts them to any useful purpose, neither really helping himself nor helping others, so they might as well just be buttons, so I call him: The Button Collector. BC has millions and millions of buttons – plus 102. For our ease today, we’ll consistently leave off the millions and millions and we’ll say he has just the 102.

On Monday night, at 6 PM, BC counts his buttons and finds he has 102 whole buttons (we will have no half buttons here please); Tuesday night, he counts again: 104 buttons; on Wednesday night, 106. With this information, we can do wonderful statistical-ish things. We can find the average number of buttons over three days (both mean and median). Precisely 104.

We can determine the statistical trend represented by this three-day data set. It is precisely +2 buttons/day. We have no doubts, no error bars, no probabilities (we have 100% certainty for each answer).
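
For readers who want to check the arithmetic, here is a minimal Python sketch. It is an editorial illustration, not part of the original essay; the day indices 0, 1, 2 and the helper name least_squares_slope are assumptions made for the example. It computes the mean, the median, and an ordinary least-squares slope for the three counts.

```python
from statistics import mean, median

# Button counts on Monday, Tuesday and Wednesday night (from the essay).
counts = [102, 104, 106]
# Day indices are an assumption for illustration: Monday = 0, Tuesday = 1, Wednesday = 2.
days = [0, 1, 2]


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (hypothetical helper)."""
    x_bar = mean(xs)
    y_bar = mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


print("mean:", mean(counts))
print("median:", median(counts))
print("trend (buttons/day):", least_squares_slope(days, counts))
```

Everything this sketch produces is a summary of the three counts already made. Nothing in it refers to Thursday or Friday.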

How many buttons will there be Friday night, two days later? 

If you have answered with any number or a range of numbers, or even let a number pass through your mind, you are absolutely wrong.

The only correct answer is: We have no idea how many buttons he will have Friday night because we cannot see into the future.

But, you might argue, the trend is precisely, perfectly, scientifically statistically +2 buttons/day and two days pass, therefore there will be 110 buttons. All but the final phrase is correct, the last — “therefore there will be 110 buttons” — is wrong.

We know only the numbers of buttons counted each of the three days – the actual measurements of number of buttons. Our little three point trend is just a graphic report about some measurements. We know also, importantly, the model for taking the measurements – exactly how we measured – a simple count of whole buttons, as in 1, 2, 3, etc.

We know how the data was arrived at (counted), but we don’t know the process by which buttons appear in or disappear from BC’s collection.

If we want to be able to have any reliable idea about future button counts, we must have a correct and complete model of this particular process of button collecting. It is really little use to us to have a generalized model of button collecting processes because we want a specific prediction about this particular process.

Investigating, by our own observation and close interrogation of BC, we find that my eccentric acquaintance has the following apparent button collecting rules:

  • He collects only whole buttons – no fractional buttons.
  • Odd numbers seem to give him the heebie-jeebies; he only adds or subtracts even numbers of buttons so that he always has an even number in the collection.
  • He never changes the total by more than 10 buttons per day.

These are all fictional rules for our example; of course, the actual details could have been anything. We then work these into a tentative model representing the details of this process.

So now that we have a model of the process, how many buttons will there be when counted on Friday, two days from now?

Our new model still predicts 110, based on trend, but the actual number on Friday was 118.

The truth being: we still didn’t know and couldn’t have known.

What we could know on Wednesday about the value on Friday:

  • We could know the maximum number of buttons – 106 plus ten twice = 126
  • We could know the minimum – 106 minus ten twice = 86
  • We could know all the other possible numbers (all even, all between 86 and 126). I won’t bother here, but you can see it is 106+0+0, 106+0+2, 106+0+4, etc.
  • We could know the probability of the answers, some answers being the result of more than one set of choices. (such as 106+0+2 and 106+2+0)
  • We could then go on to figure five day trends, means and medians for each of the possible answers, to a high degree of precision. (We would be hampered by the non-existence of fractional-buttons and the actual set only allowing even numbers, but the trends, means and medians would be statistically precisely correct.)
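
The list above can be sketched in a few lines of Python. This is an editorial illustration under a loudly stated assumption: every allowed daily change (-10, -8, ..., +10) is treated as equally likely, which the essay never claims. The names WEDNESDAY_COUNT, DAILY_CHANGES, friday_counts and friday_probability are hypothetical.

```python
from collections import Counter
from itertools import product

WEDNESDAY_COUNT = 106
# Allowed daily changes under the three rules: whole, even, at most 10 either way.
DAILY_CHANGES = range(-10, 11, 2)

# Count how many (Thursday change, Friday change) pairs lead to each Friday total.
friday_counts = Counter(
    WEDNESDAY_COUNT + thursday + friday
    for thursday, friday in product(DAILY_CHANGES, repeat=2)
)
n_paths = sum(friday_counts.values())

# ASSUMPTION (not in the essay): each allowed daily change is equally likely.
friday_probability = {
    total: ways / n_paths for total, ways in sorted(friday_counts.items())
}

print("minimum:", min(friday_counts), "maximum:", max(friday_counts))
print("distinct possible totals:", len(friday_counts))
print("ways to reach 110:", friday_counts[110], "of", n_paths)
print("ways to reach 118:", friday_counts[118], "of", n_paths)
```

Under that uniform assumption the trend value 110 can be reached in 9 of the 121 possible two-day paths and the eventual actual value 118 in 5. The enumeration lists what is possible under the rules; it does not say which path BC will take.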

What we couldn’t know:

  • How many buttons there would actually be on Friday.

Why couldn’t we know this? We couldn’t know because our model – our button collecting model – contains no information whatever about causes. We have modeled the changes, the effects, and some of the rules we could discover. We don’t know why and under what circumstances and motivations the Button Collector adds or subtracts buttons. We don’t really understand the process – BC’s button collecting – because we have no data about the causes of the effects we can observe or the rules we can deduce.

And, because we know nothing about causes in our process, our model of the process, being magnificently incomplete, can make no useful predictions whatever from existing measurements.

If we were able to discover the causes effective in the process, and their relative strengths, relationships and conditions, we could improve our model of the process.

Back we go to The Button Collector, and under a little stronger persuasion he reveals that he has a secret formula for determining whether to add to or subtract from the number of buttons previously observed. Armed with this secret formula, which is precise and immutable, we can now adjust our model of this button collecting process.

Testing our new, improved, and finally adjusted model, we run it again, pretending it is Wednesday, and see if it predicts Friday’s value. BINGO! ONLY NOW does it give us an accurate prediction of 118 (the already known actual value) – a perfect prediction of a simple, basic, wholly deterministic (if tricky and secret) process by which my eccentric acquaintance adds and subtracts buttons from his collection.

What can and must we learn from this exercise?

1. No statistical trend, no matter how precisely calculated, regardless of its apparent precision or length, has any effect whatever on future values of a data set – never, never and never. Statistical trends, like the data of which they are created, are effects. They are not causes.

2. Models, not trends, can predict, project, or inform about possible futures, to some sort of accuracy. Models must include all of the causative agents involved which must be modeled correctly for relative effects. It takes a complete, correct and accurate model of a process to reliably predict real world outcomes of that process. Models can and should be tested by their abilities to correctly predict already known values within a data set of the process and then tested again against a real world future. Models also are not themselves causes.

3. Future values of a thing represented by a metric in data set output from a model are caused only by the underlying process being modeled–only the actual process itself is a causative agent and only the actual process determines future real world results.

PS: If you think that this was a silly exercise that didn’t need to be done, you haven’t read the comments section at my essay at Dot Earth. It never hurts to take a quick pass over the basics once in a while.

# # # # #

222 Comments
Matthew R Marler
October 18, 2013 11:25 am

Richard S Courtney: Sorry, but the human brain is very good at assessing trajectories (i.e. trends in the spatial changes of objects). Ball games would not be possible if this were not so.
Today we are in agreement. That’s an excellent example.

Editor
October 18, 2013 11:28 am

Some random replies:
“How many people do you know who are familiar with Hume, Jaynes, or others of that caliber?” None, including myself 🙂 I am a lowly “practician”.
Howard Booth (representing a substantial number of similar comments) says: “While it is easy (and perhaps a bit enjoyable) to poke fun at “scientists” who hang their hats on statistical trending and modeling, it would be disingenuous to say there is zero value in using modeling as a predictive tool.”
My essay pokes no fun at anyone (well, possibly a bit at my eccentric acquaintance, The Money Collector, who actually exists, btw). My essay is about common misunderstandings regarding the naïve idea that trend lines on graphs of things can predict the future. There are important scientific fields called statistical modeling and forecasting. Both are perfectly valid in their own realms and both have very specific rules and procedures and principles that must be followed to ensure that their results are useful and valid. Mr. Booth, and those with this concern, need only read the comments to my post at Dot Earth—[ http://dotearth.blogs.nytimes.com/2013/10/09/on-walking-dogs-and-global-warming-trends/ ] —to understand why this simplistic essay has been written. While I am sure that the truth is perfectly clear to you (and others with a similar concern), there are persons who profess to believe that trends determine future values, in and of themselves.
In regards to Non-Linearity: To almost whatever you have to say about it, I say “I agree”. I have followed Lorenz and non-linearity since my youth and so, YES, to whatever you want to say about it and its ramifications on the subject of trends, models, and processes. Most of the comments revolve around this: the modeling of a non-linear system, even if it is composed of a single simple wholly deterministic formula used re-iteratively, produces results that can be said to be chaotic and “unpredictable”. Most here at WUWT agree with the IPCC that the Earth’s climate system is a “bounded non-linear chaotic system”.

Doug Huffman
October 18, 2013 11:40 am

Kip Hansen says: October 18, 2013 at 9:48 am “But why can we do so, since trends cannot and do not predict the future?”
E. T. Jaynes, Probability Theory, 3.2 Logic vs. propensity (p 62) “In all the sciences, logical inference is more generally applicable. We agree that physical influences can propagate only forward in time; but logical inferences propagate equally well in either direction.”
I’ve highlighted this paragraph, while most of my notes are in pencil. I remembered it as; cause and effect are constrained to the arrow of time, logic not so. This paragraph follows, three paragraphs after, a criticism of Popper’s ‘propensity’.

October 18, 2013 11:49 am

meemoe_uk says:
October 17, 2013 at 7:38 pm
author is getting on a high horse over words…
I couldn’t agree more! The essay you posted on this topic was much more informative and riveting. Thank you for taking the time to educate the audience and to put your thoughts out for public criticism.
Eric

Editor
October 18, 2013 11:55 am

Reply to richardscourtney (October 18, 2013 at 11:13 am) :
I have no wish to embarrass Richard but there are quite a few comments that use this or a similar logical fallacy to confuse themselves about trends:
He says: “Sorry, but the human brain is very good at assessing trajectories (i.e. trends in the spatial changes of objects). Ball games would not be possible if this were not so.”
As I have pointed out previously, confusing “trajectories” with “trends” will lead you wrong every time. A trajectory is not a trend – they only look the same when graphed on a piece of paper. A trajectory is the path of a moving object based on the laws of Newtonian physics—it is part of an ongoing physical process. It is not a numerical construct or the output of a numerical model (though one could easily model the process and add the details of a trajectory that one could use to place the Rover gently on the surface of Mars). Conflating trajectory—a physical process—and trend—a numerical report about the past of a process—is a grievous logical error.
And “A trend can and does predict the future so long as the trend continues into the future. But trends change with time and, therefore, trends are imperfect predictors of the future.” It is the physical process that is likely to continue into the future—the trend is simply a report and has no existence until new data are added moment by moment. The model will tell us if the physical process is likely to produce a new datum in line with the past trend. To say that “all of my correct predictions will be correct if they are correct and my incorrect predictions incorrect” says nothing more than “sometimes I guess right, sometimes I guess wrong.”—there is nothing there about prediction, it is just guessing—reference Feynman regarding scientific guessing.

Brent Hargreaves
October 18, 2013 12:09 pm

“He will have between 0 and a googolplex of buttons. I’m 95% certain of it.
I’m 100% sure he’ll have between 0 and infinity (inclusive).”
Arts graduate’s view of the above: Why limit oneself to just that range? Those pesky scientific types are SO unimaginative, so black-and-white, so obsessed with right and wrong.

GunnyGene
October 18, 2013 12:15 pm

Doug Huffman says:
October 18, 2013 at 11:40 am
Re: Jaynes
Good to know someone else who has studied him. 🙂 I’m sure, then, that you’ve also heard the following:
“Probabilities do not exist.” And: “In the absence of Reality, Probability rules.” Both of which require considerable explanation which I won’t go into here, and the concepts can be difficult for many to wrap their heads around.

October 18, 2013 12:28 pm

I am one of the ones you told not to read it, because I already understood it, but I read it anyway. Apparently there are still a lot of people around here who still don’t understand it.

GunnyGene
October 18, 2013 12:33 pm

Kip Hansen says:
October 18, 2013 at 11:28 am
Some random replies:
“How many people do you know who are familiar with Hume, Jaynes, or others of that caliber?” None, including myself 🙂 I am a lowly “practician”.
****************************************************************
As was I for many years. But, I’d recommend Jaynes to you anyway. Here’s a link you can pursue at your leisure: http://omega.albany.edu:8008/JaynesBook.html

October 18, 2013 1:00 pm

…Current performance may be lower or higher than the quoted past performance, which cannot guarantee future results…

Investors get to read these informative words, or very similar phrases, on virtually every investment returns page.
Why?
Well, these people are held accountable otherwise.

“Steve Obeda says: October 17, 2013 at 7:06 pm
“The only correct answer is: We have no idea how many buttons he will have Friday night because we cannot see into the future.”
— I would say that states things a little more harshly than is justified and that a more correct answer is “We cannot be sure, but the trend indicates a best guess of…” I forecast using trends all the time. A trend is useful even if it is only partially informative. No, we don’t know the underlying equation and we might be wrong but in many cases it doesn’t matter.
What’s missing here is the intended use of the forecast. If your boss is asking you to forecast sales, you’d better not tell him that there’s no way to know for sure. Before firing you he’ll say he knows that, but the business needs to make decisions on inventory and capacity. It doesn’t matter if you’re a bit wrong because you’ll simply adjust the forecast. But if your boss is deciding whether to decarbonize the global economy at a cost of several trillion dollars, then the strength of your model had darn well better justify it. Otherwise, the answer is going to be to keep studying and keep him posted on your progress, but hold off on the big spending.”

That never makes the trend accurate or even mostly correct. It is a deliverable, period. Try using the ‘official cya’ chit that financial institutions use to protect their butts. Your boss will get just as mad and it won’t make your trend any more accurate. So the trend providers who don’t like suffering often provide updated trends as frequently as their data sources update. That’s the only way to account for changing variables; like the economy tanking overnight and customers stop buying immediately and distributors start bellowing about unsold product while your suppliers are demanding payment for inventory just delivered… Oh yeah! Trends told your boss everything.
A trend is old the day it is made and its age escalates as time trickles by.

“Fred the Statistician says: October 17, 2013 at 10:38 pm
“The trend is utterly useless as a predictor — always.”
This is absolutely incorrect. In the absence of any other data, past trends are the BEST predictor of future values.
This entire article shows a complete failure to understand statistics. Statistics can be used to make predictions, based on certain assumptions. If those assumptions are correct, the prediction is MORE LIKELY to be correct. If the assumptions are wrong, then the prediction is LESS LIKELY to be correct. Nowhere are guarantees made…
…”

“Fred the Statistician says: October 17, 2013 at 10:38 pm

The correct prediction in the article would have been “Given the assumptions we are making (iid, normal, etc), we predict that it is most likely that there will be 110 buttons, give or take (a very large number).” No statistician would EVER say “There WILL BE 110 buttons.” Because statisticians actually understand statistics.
… “

Assumptions! Which preferably are accurate observations masquerading as assumptions.
Unfortunately today’s observations are minimal and everything later is unknown, totally unknown. While the computation gives a ‘confidence’ interval it is rarely very comforting when one hands trends off to the boss or public.
It may be an educated guess, but it is still a guess.

“Fred the Statistician says: October 17, 2013 at 10:38 pm

“We can’t know the future” is an utterly silly argument. Will the sun rise in the East tomorrow? I hope you didn’t say “yes”, because you can’t know the future! If I let go of this pencil I’m holding in the air, will it fall to the ground? I hope you didn’t say “yes”, because you can’t know the future!
… “

Ever do a trend on the sun? What did you use for assumptions?
If you drop your pencil, it will fall to the ground because gravity or the ‘earth’s bent space’ condition is active. Your dropping or implying you are going to drop a pencil is ‘not’ a trend! It is a single action. Now if you always drop your pencil three minutes prior to leaving work, that we can trend. But only the act of you dropping the pencil, gravity is still a relative constant (i.e. any changes are undetectable to casual or even intensive observation).
If all assumptions are not only correct, but stay reasonably consistent for the time duration covered by the trend; then we can make an accurate guess.
Yeah, I did trends for a large organization; some daily, some weekly, everything on the accounting period, fiscal year and for 5, 10, 20 year plans. Five-year plans are a joke as they rarely were worth much beyond “What in the world did we think we were going to do?” historical amusement, mostly because some executive beliefs were masquerading as assumptions, and as soon as executives change or their minds change, that major assumption is toast.
I also learned to check daily journal (accounting) entries watching for surprise hits.
When someone would demand to know when I would deliver an accurate plan (trend); I would usually respond with “As soon as you can tell me exactly how many workhours will be used, product delivered, items purchased, capital equipment purchased and facilities built and maintained.”
Repeat! If all assumptions are not only correct, but stay reasonably consistent for the time duration covered by the trend; then we can make an accurate guess.
Guess what major blind faith movement is betting everything that has consistently proved their assumptions are wrong?
Perhaps we should be doing a performance trend on models, modelers, and modeler PR people?

John Whitman
October 18, 2013 1:09 pm

Kip Hansen,
Your ‘Buttons’ has attracted an excellent statistical audience with lively metaphysics / epistemological discourse. Thanks. It doesn’t get more enjoyable than this.
Hey, just yesterday I started on Jaynes’ book and rgbatduke’s partial draft of his book project ‘Axioms’. So, my statistical paradigm is being tested right now. I’ll need to get back to you later on your ‘Buttons’.
Personal Note=> I was quite critical of Popper’s science theory related thesis starting in the 1980s after reading Brand Blanshard’s ‘Reason and Analysis’ but now with Jaynes and Brown input I anticipate becoming more critical of Popper.
John

richardscourtney
October 18, 2013 1:38 pm

Kip Hansen:
I am replying to your post at October 18, 2013 at 11:55 am (in response to my post at October 18, 2013 at 11:13 am).
You say

I have no wish to embarrass Richard but there are quite a few comments that use this or a similar logical fallacy to confuse themselves about trends:

I made no “logical fallacy” and you do not cite one. The only person whom you “embarrass” is yourself by your answer to my point.
As always when involved in a semantic disagreement, a definition of terms is needed. In this case, there is a dictionary definition and a mathematical description of “trend”. This is the definition according to the Online Dictionary.

trend (trĕnd)
n.
1. The general direction in which something tends to move.
2. A general tendency or inclination. See Synonyms at tendency.
3. Current style; vogue: the latest trend in fashion.

Clearly, we are discussing “1. The general direction in which something tends to move.”
And in statistics, a trend that is detected and observed in a time series is modelled. This is what the ‘Statistics Glossary’ says
http://www.stats.gla.ac.uk/steps/glossary/time_series.html

Trend is a long term movement in a time series. It is the underlying direction (an upward or downward tendency) and rate of change in a time series, when allowance has been made for the other components.
A simple way of detecting trend in seasonal data is to take averages over a certain period. If these averages change with time we can say that there is evidence of a trend in the series. There are also more formal tests to enable detection of trend in time series.
It can be helpful to model trend using straight lines, polynomials etc.

Clearly, you are plain wrong when you say of me

He says: “Sorry, but the human brain is very good at assessing trajectories (i.e. trends in the spatial changes of objects). Ball games would not be possible if this were not so.”
As I have pointed out previously, confusing “trajectories” with “trends” will lead you wrong every time. A trajectory is not a trend – they only look the same when graphed on a piece of paper. A trajectory is the path of a moving object based on the laws of Newtonian physics—it is part of an ongoing physical process.

A trajectory is
“The general direction in which something tends to move”
and it can be recorded as an incremental time series
which can then be analysed to determine the form of the trend
that can be plotted on graph paper.
You have confused the statistical record of the trend as being the trend itself. And the actual trajectory IS the trend itself.
You are also wrong when you quote me and dispute my quote saying

And “A trend can and does predict the future so long as the trend continues into the future. But trends change with time and, therefore, trends are imperfect predictors of the future.” It is the physical process that is likely to continue into the future—the trend is simply a report and has no existence until new data are added moment by moment.

No, the trend is “The general direction in which something tends to move” and the “report” is of that “general direction”. The trend will end probably because the physical process which causes the trend will alter or end, but the trend is NOT the physical process; the trend is a result of that physical process(es) which could be random chance.
And I am fully conversant with Feynman.
Richard

GunnyGene
October 18, 2013 1:51 pm

John Whitman says:
October 18, 2013 at 1:09 pm
Kip Hansen,
Your ‘Buttons’ has attracted an excellent statistical audience with lively metaphysics / epistemological discourse. Thanks. It doesn’t get more enjoyable than this.
****************************************************************************
Indeed. 🙂 Of course, all of this is our feeble human attempt to discern future Reality. So from a probabilistic pov, we must first define what reality is. The prevailing definition is when P=1, which is commonly termed an “Event” (hopefully a desired outcome). So the decision is one of choosing the set of probabilities that will result in the desired outcome, based on our level of knowledge and our influence over the process that (we hope) will lead to that outcome. Since we don’t have perfect knowledge we cannot have perfect confidence in our actions to attain that desired outcome. The Black Swan is always hovering. You pays your money, and takes your chances. 🙂

Pamela Gray
October 18, 2013 1:52 pm

We do trend analysis in education all the time. Rate of improvement is one way to determine whether or not learning pace is sufficient to catch up to grade level. Unfortunately, many educators do not know what they are looking at, don’t use a defensible form of calculating a linear trend, and do not use this analysis appropriately even when calculated correctly.
For example, when measuring reading speed improvement over time many focus on the linear trend and ignore the data spread. Not a good thing to do. I rarely see a tight time-dependent correlation, instead seeing data spread above and below the linear trend in odd herky jerky fashion. Yet educators use the trend line to adjust instruction, assuming that instruction is the driver of the trend line and individual data points can be safely ignored.
Nothing could be further from the truth. Data spread is where the focus should be, not on the trend line itself. Sometimes the reason for far flung performance above and below the trend line has nothing to do with instruction. So changing instruction in hopes of a better trend, or worse, measuring teacher performance on such a metric, is patently wrong. And for the exact reasons outlined in the post above.
4 marks.

Editor
October 18, 2013 2:25 pm

Reply to Richard:
We’ll just have to see what others think about your view.
We can’t discuss it if you insist that trajectory of a cannonball is really a statistical trend and I maintain that the trajectory (the physical path that the baseball has followed so far and its future projected path based on Newtonian physics) of the baseball is an actual physical world process. I maintain that even though the words used describing the two are very similar, they are not the same nature of thing.
My view includes the point that I could model the path of the baseball, and my model, because it is based on Newtonian physics, would produce a fairly precise projected trajectory for the ball. The projected path would look like a “curving trend” on a graph, but it is not a statistical trend.
Thus, we’ll have to leave it for another time. Thank you for participating.

Editor
October 18, 2013 2:26 pm

Yikes — too many balls! I believe Richard is speaking of baseballs and not cannonballs.

Editor
October 18, 2013 2:30 pm

GunnyGene: Thanks for the link to Jaynes’ book.

Ted Carmichael
October 18, 2013 2:30 pm

Hi, Kip. You said, “My essay is about common misunderstandings regarding the naïve idea that trend lines on graphs of things can predict the future. [ ] While I am sure that the truth is perfectly clear to you (and others with a similar concern), there are persons who profess to believe that trends determine future values, in and of themselves.”
I’m having a hard time understanding why you are persisting in these arguments. You now seem to be saying that there are people out there who believe that dots on a page with a trend line drawn through them are actually the causal agent that determines the future. And (or so it seems) you are arguing rigorously against this proposition.
It is my feeling that, in your zeal to dissuade these phantom prognosticators, you have gone too far in the other direction, and made some statements that are also wrong.
No one believes that a trend on a graph is a causal agent. It is completely and utterly understood, as Richard Courtney stated above, that the data + analysis is merely the statistical record of what has occurred in the real world. As such, that analysis – using time-tested methods – can give a solid prediction of the future. How solid that prediction is, and at what level of precision, depends upon the quality and characteristics of the data, and the skill of the statistician performing the analysis.
I gave you the best version of an empirical model that I could think of: gravity. We do not know anything about the causal agents of gravity (other than, as I said, a few recent hypotheses that have not yet been proven). Yet we know very precisely how gravity behaves. We know this (and I am trying to stick to ‘common sense’ language, as you stated is your preference) only because of a vast number of measurements that fit very well into fixed and consistent formulas. It is wholly an empirical theory. Newton came up with the main formula based (mainly) on observations of planetary orbits; Einstein refined Newton’s findings based (mainly) on measurements of the speed of light. The theory of gravity is thus consistent, but not with any need or reliance on causal agents – it is consistent with our observations. It is empirical.
When someone says “The trend predicts X,” what they mean is that the discovered pattern is robust enough to make a solid prediction about the future. You can argue that the level of certainty is too high, or that the level of precision is overstated. (Depending on the particulars, of course.) But you cannot say that an identified trend has no information to offer about the future.
In your BC example, you gave a very short and simple trend: 102, 104, and 106, on three successive days. And then you said – and here is where you went too far – that this trend tells us “absolutely nothing” about the number of buttons on day five. In other words, you state that day five’s value can be anything; that 110 is just as likely as 3; or 57; or 72,639; or ten billion. I would say, instead, that 110 is the best guess, and is more likely to be right than any other value (even though, with only three data points, it is not a very certain guess).
Statistics can quantify this guess and tell us the probability that it is correct. In fact, that is all that statistics does. I don’t think I really understand what part of this you don’t agree with.
Cheers.

Editor
October 18, 2013 2:34 pm

John Whitman: Do you have a link to “rgbatduke’s partial draft of his book project ‘Axioms’.”?

Editor
October 18, 2013 2:41 pm

Doug Huffman: Re Jaynes’ “We agree that physical influences can propagate only forward in time; but logical inferences propagate equally well in either direction.” IMHO, this is what allows us to use both inductive and deductive reasoning to work out models from known facts and forward to facts from known models. Not my area of specialty.

richardscourtney
October 18, 2013 2:51 pm

Kip Hansen:
re your post at October 18, 2013 at 2:25 pm.
I am not impressed with your sign off of, “Thankyou for participating”.
I have pointed out and explained that you are mistaking the result of a set of physical processes (i.e. a trend) for the physical processes themselves.
The trend will change when the processes alter or end. And that is why a trend is a predictor of the future, although an imperfect one.
You say the trend is not a predictor but the set of physical processes is.
If you were right, then farmers would not have planted their crops at the right time of year for millennia. Fortunately, farmers understood what the trend of past seasons predicted, although they had no knowledge of orbital mechanics.
As you said, thankyou for participating.
Richard

rgbatduke
October 18, 2013 2:54 pm

Yes, cited above! Particularly Section 5.3 ‘Converging and diverging views’ (pp 126 – 132) excerpted here
http://www.variousconsequences.com/2009/11/converging-and-diverging-views.html

I have a different but related example that addresses the same sort of thing. In The Tenth Kingdom movie/series, close to the end, Tony and his friends encounter an enchanted frog in front of two doors. The frog croaks out “One door leads to freedom; the other door leads to certain death. You may ask me one question, but I always lie!”
Tony proceeds to rant about how silly it is (even in fairy stories) to have a door that leads to certain death guarded by a lying enchanted frog, picks up the frog and throws him through one of the two doors. Explosion. Wolf says “I guess it is the other one”.
This is a marvelous example in logic as well as being funny as all hell. What does Tony know? Well, the frog says that it always lies. This is a statement that literally cannot be true — “I always lie” is necessarily a lie because it cannot be true. However, nothing stops one from having a frog that sometimes lies.
Confronted with a frog that cannot be counted on not to deceive you, what’s the best course? Empiricism! Test the doors to find out! And what better to test them with than a lying frog!
That still doesn’t prove that the OTHER door doesn’t lead to certain death, and of course it does. All doors lead to certain death. Some perhaps sooner than others…;-)
rgb

rgbatduke
October 18, 2013 2:56 pm

John Whitman: Do you have a link to “rgbatduke’s partial draft of his book project ‘Axioms’.”?
http://www.phy.duke.edu/~rgb/axioms.pdf
There’s a still earlier version under one of the menus at the top of my personal website as well.
rgb

Editor
October 18, 2013 2:58 pm

Ted Carmichael: Have you been trained in statistics? My guess is yes, you have. That’s the problem here. I am not trained in statistics, I am a “practician” (practicing the art of the practical).
You want to talk statistics and probability.
As a practician, I insist: the numbers 102, 104, 106, given in order, can inform us of nothing whatever except the counts of buttons on those particular three days (and the offered fact that they were counted, not weighed or averaged or something). That’s it. Nothing else: nothing possible, nothing probable, nothing about tomorrow’s value at all. I can say this with 100% certainty because we, as yet, know absolutely nothing whatever about the process represented by these numbers, except our expectation that they will be counted again tomorrow.
Please don’t go on if you haven’t at least read all my replies so far on this post; you are already forcing me to repeat things I have already repeated several times.
Now, quite honestly, if you hope to make your point, you must have a logical, scientific, or even theological reason for believing that you can make any kind of forecast or prediction at all in the utter absence of any understanding of what is being predicted.
I will be glad to listen directly to whatever Statistical Expert you may wish to supply in this regard — but I don’t think that anyone will say you can make a prediction from a position of total ignorance.

rgbatduke
October 18, 2013 3:16 pm

“How many people do you know who are familiar with Hume, Jaynes, or others of that caliber?” None, including myself 🙂 I am a lowly “practician”.
(Raises hand…)
I have no idea what a “lowly practician” is in the context of statistics. Either you understand the concepts and have worked through the math (in which case you understand precisely how, and in what sense, and under what conditions statistical trends are predictive) or you are just saying lots of things that are not, as I pointed out, true. In which case you would be well advised not to write entire untrue and misleading articles.
Myself, I’m on my second company founded on predictive modelling. We sell a service that is worth money — quite a lot of money — because you are incorrect. If you were, in fact, correct, we’d instantly go out of business. Not only do we build predictive models, we do so without any strong underlying assumptions concerning the nonlinear function that describes the joint probability distribution we are modelling. They are effectively almost purely empirical models, in other words, with only a few Bayesian assumptions that go into the structures. I understand this stuff very well indeed.
Hence my suggestion that you START by learning some actual statistics before you make egregiously incorrect claims about statistics (which not only I, but pretty much every person knowledgeable about the subject have tried to correct you on, in many cases with nearly identical comments and yet independently — in fact, there is a trend here that I expect to continue as we continue to sample the knowledgeable fraction of WUWT readers…hmmm:-).
Again, if your goal is to justify a statement such as “GCMs suck!” feel free to make the statement, and there are many ways to justify such a claim. But don’t try to justify it by stating “predictive models, statistical or otherwise, never work”. That’s nonsense. As a previous poster noted, if one builds a model correctly, the “trends” it extracts from the data are indeed, in a mathematically rigorous sense, the best prediction of the future, almost always preferable to an assumption of complete ignorance of the future. There is literally a century of statistical theory, maximum likelihood, maximum entropy, Bayesian reasoning, and more to justify this. Exceptions to this principle are empirically rare. Indeed, truly “unpredictable” means “maximum entropy” in a sense that is rarely realized in nature, in the sense that outcomes are completely, truly, random. If you read David MacKay’s online book on Artificial Intelligence and Information Theory (which is hard going, be warned) you will learn about information redundancy and compressibility and perhaps get some insight into why your assertions are not only wrong, they are deeply wrong.
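To make the “maximum entropy” remark concrete, here is a small sketch comparing the Shannon entropy of a uniform distribution (every outcome equally likely, so nothing can be predicted better than chance) with a skewed one (one outcome dominates, so a guess is usually right). The two four-outcome distributions and the helper name `shannon_entropy` are invented for illustration and are not taken from the discussion above.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical distributions over four possible outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]   # completely random: maximum entropy
skewed = [0.85, 0.05, 0.05, 0.05]    # one outcome dominates: lower entropy

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)

print(f"uniform: {h_uniform:.3f} bits, skewed: {h_skewed:.3f} bits")
```

The uniform case reaches the maximum possible for four outcomes (2 bits). Any departure from uniformity lowers the entropy, and that gap is the redundancy a predictive model or a compressor can exploit.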
rgb