McIntyre on Stephen Schneider

An excerpt from Steve’s post at Climate Audit

Schneider replied that he had been editor of Climatic Change for 28 years and, during that time, nobody had ever requested supporting data, let alone source code, and he therefore required a policy from his editorial board approving his requesting such information from an author. He observed that he would not be able to get reviewers if they were required to examine supporting data and source code. I replied that I was not suggesting that he make that a condition of all reviews, but that I wished to examine such supporting information as part of my review, was willing to do so in my specific case (and wanted to do so under the circumstances) and asked him to seek approval from his editorial board if that was required.

This episode became an important component of the Climategate emails in the first half of 2004. As it turned out (though it was not a point that I thought about at the time), both Phil Jones and Ben Santer were on the editorial board of Climatic Change. Some members of the editorial board (e.g. Pfister) thought that it would be a good idea to require Mann to provide supporting code as well as data. But Jones and Santer lobbied hard and prevailed on code, though not on data: they defeated any requirement that Mann supply source code, but Schneider did adopt a policy requiring authors to supply supporting data.

I therefore re-iterated my request as a reviewer for supporting data – including the residuals that Climategate letters show that Mann had supplied to CRU (described as his “dirty laundry”). The requested supporting data was not supplied by Mann and his coauthors and I accordingly submitted a review to Climatic Change, observing that Mann et al had flouted the new policy on providing supporting data. The submission was not published. I observed on another occasion that Jones and Mann (2004) contained a statement slagging us, based on a check-kiting citation to this rejected article.

During this exchange, I attempted to write thoughtfully to Schneider about processes of due diligence, drawing on my own experience and on Ross’ experience in econometrics. The correspondence was fairly lengthy; Schneider’s responses were chatty and cordial and he seemed fairly engaged, though the Climategate emails of the period perhaps cast a slightly different light on events.

Following the establishment of a data policy at Climatic Change, I requested data from Gordon Jacoby – which led to the “few good men” explanation of non-archiving (see CA in early 2005) and from Lonnie Thompson (leading to the first archiving of any information from Dunde, Guliya and Dasuopu, if only summary 10-year data inconsistent with other versions.) Here Schneider accomplished something that almost no one else has been able to do – get data from Lonnie Thompson, something that, in itself, shows Schneider’s stature in the field.

It was very disappointing to read Schneider’s description of these fairly genial exchanges in his book last year. Schneider stated:

The National Science Foundation has asserted that scientists are not required to present their personal computer codes to peer reviewers and critics, recognizing how much that would inhibit scientific practice.

A serial abuser of legalistic attacks was Stephen McIntyre, a statistician who had worked in Canada for a mining company. I had had a similar experience with McIntyre when he demanded that Michael Mann and colleagues publish all their computer codes for peer-reviewed papers previously published in Climatic Change. The journal’s editorial board supported the view that the replication efforts do not extend to personal computer codes with all their undocumented subroutines. It’s an intellectual property issue as well as a major drain on scientists’ productivity, an opinion with which the National Science Foundation concurred, as mentioned.

This was untrue in important particulars and a very unfair account of our 2004 exchange. At the time, Schneider did not express any hint that the exchange was unreasonable. Indeed, the exchange had the positive outcome of Climatic Change adopting data archiving policies for the first time.

As I noted above, at his best, Schneider was engaging and cheerful – qualities that I prefer to remember him by. I was unaware of his personal battles or that he ironically described himself as “The Patient from Hell” – a title that seems an honorable one.

Read more at Climate Audit

Joe Lalonde
July 21, 2010 5:35 am

Proxies (gotta love that word!) are used in place of the actual data and are acquired in many different areas and forms. Essentially, they are theories to back up the data.
Is the “normal” public aware that many areas use proxies, and of how they were created? Of how proxies affect the overall final numbers?
Science is following the actual evidence, not following the theory to create an outcome.
If Anthony had the magic money wand and said “I am giving grants to any scientific theories that are credible”, the different ideas and different mindsets would show how each of us thinks differently; yet other influences and ideas from others seem to alter the final outcome when allowed to follow the scientific trail. What we have acquired today in knowledge is not the same as 10 or 20 years ago in our own path of thought.

Richard111
July 21, 2010 5:39 am

If “personal computer codes with all their undocumented subroutines” are not cleaned up and offered with the data how will the author ever learn from any mistakes?

RockyRoad
July 21, 2010 5:43 am

John Egan says:
July 21, 2010 at 5:05 am
Even if there is profound disagreement with someone –
To publish something like this a day after a person dies is unseemly.
It should be removed.
————-Reply:
Nope. While I respect the memory of Stephen Schneider (from the perspective of a fellow human that will meet the same fate in the near future), his passing will either elevate his work or cause it to be evaluated realistically. Besides, I’m betting if all the above comments were glowing reviews, you’d have no problem with it. That’s a double standard.

Yarmy
July 21, 2010 5:46 am

David says:
July 21, 2010 at 5:01 am
The graph you used is cherry picked and out of context. It came from Loehle E & E in a paper called “A 2000-Year Global Temperature Reconstruction Based on Non-Tree-ring Proxies.”

The url of the graph is irrelevant: it’s just the first one google spat out when I went looking for it. The point is that – contrary to JCrabb’s assertion – the Loehle reconstruction does not show that current temps are unprecedented in 1000 years.

Andrew Zalotocky
July 21, 2010 5:54 am

Andrew30 says:
“If you can read the language you can easily understand the function”
If you can read the language you should be able to work out what the function is doing but it won’t be easy if the code is an “idiosyncratic and uncommented” mess. More importantly, the fact that you can understand what it’s doing doesn’t necessarily mean you can understand why it’s doing it. Comments are necessary to explain the purpose of each part of the code and the assumptions behind it. If you don’t have that information it is impossible to be sure if the code is actually doing what it was intended to do.
The act of writing comments also forces you to make your assumptions explicit, thus making it more likely that any flaws in the methodology will be spotted at that stage. It ensures that you will have this information to hand if you have to modify the code in future, as you will probably have forgotten some or all of it by then. Comments are an integral part of good code.
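To make the point concrete, here is a small invented sketch (not drawn from any actual climate code) in which the code itself shows *what* is computed, but only a comment records *why* a particular choice was made:

```python
# Invented illustration: the arithmetic shows WHAT happens;
# only the comment records WHY.

def standardize(series, base_start, base_end):
    """Convert a yearly (year, value) series to standardized anomalies."""
    base = [v for (yr, v) in series if base_start <= yr <= base_end]
    mean = sum(base) / len(base)
    # The "why" only a comment can carry: the variance is computed over
    # the base period alone, not the full record -- a deliberate
    # methodological choice, invisible as such in the bare arithmetic.
    var = sum((v - mean) ** 2 for v in base) / len(base)
    sd = var ** 0.5
    return [(yr, (v - mean) / sd) for (yr, v) in series]

data = [(1990, 14.1), (1991, 14.3), (1992, 13.9), (1993, 14.5)]
anomalies = standardize(data, 1990, 1992)
```

Strip the comment and the code still runs identically, but a later reader can no longer tell whether the base-period-only variance was intended or accidental.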

Richard Tol
July 21, 2010 5:57 am

Steve Schneider was a great man.
Reproducibility is a cornerstone of science. If that requires publishing your code, so be it.

Kate
July 21, 2010 5:58 am

tallbloke at 12:05 am
“…May Schneider rest in peace, let’s bury the defunct peer review system with him…”
This is the real world of science, not the washed and scrubbed corporate media version.
Editors and scientists alike insist on the pivotal importance of peer review. The mistake is to have thought that peer review was anything more than a crude means of discovering the acceptability — not the validity — of a new finding.
The big lie is to portray peer review to the public as a quasi-sacred process that helps to make science our most objective truth teller. The system of peer review is biased, unjust, unaccountable, incomplete, easily fixed, often insulting, usually ignorant, occasionally foolish, and frequently wrong.

July 21, 2010 5:59 am

Chris Long says:
July 21, 2010 at 3:54 am
Breathtaking post from “toby”.

I am, like you, a professional programmer, and everything that you said is wrong because it is from the point of view of a programmer tasked with producing commercial code (whether for sale or use within a business). Science is very, very different.

Point 1 – as has been pointed out, if you’re writing ‘idiosyncratic’ poorly-commented code, then it’s likely to be wrong.

No, that’s not at all true. There is a lot of code out in the world that works perfectly well, and is almost untouchable because it’s so convoluted. In particular, code that evolves (as almost must be the case in a research project), and that is going to be short-lived (i.e. ancillary to one short research project, not a system with a lifespan of five to fifteen years), is not going to be robust from a programmer’s point of view. The point in that situation is to get to the answer with the least possible effort, not to produce “great code.”
And before you go on about how better code saves time in the long run, if you think that then you’re not listening and appreciating the difference in the situation. Commercial code has a clear objective from the start. It has an important goal of being stable and maintainable. Not so with research.

Point 2 – …Another programmer will be able to read your code, even if it takes longer than it should…

But it’s not going to be read by a programmer; it’s going to be read by other scientists who only use programming as much as they need to. Scientists work in science, and use programming as a tool. Again, it’s a whole different world than you are used to, and forcing your own personal experience and paradigm onto a different environment, then declaring other people wrong, is not valid.

(a) take some programming lessons

This is just arrogant, and obnoxious. Just about everyone here should take science lessons before they post, but it doesn’t stop an obscene amount of arrogance, flavored with ignorance, from polluting the blogosphere. Scientists work hard at what they do, and put their time where best needed. You need to take some science lessons before making a comment like that.

Point 3 – If you don’t release code, then for replication to be possible, you *must* provide a fully-detailed functional specification of your code.

No, and the reason you don’t understand this is because you don’t understand either science or the peer-review system. The point is not for someone else to run the same program on the same data and get the same result. The point is not for someone to go through a program with a fine tooth comb looking for logic errors.
The way science works is that someone should be able to take the same hypothesis, and the same readily available data, and completely on their own perform an experiment that yields results consistent with the original study. If it doesn’t, there’s doubt about the hypothesis, and it’s left to science to determine how the two studies differed and obtained different results. But this is how you test a hypothesis, not by demanding detailed notes. It would be like walking into Madame Curie’s lab and insisting that you use her lab equipment to repeat her experiments.
A major failing of every bystander in the issue of climate change is to apply their own personal, limited experiences and training to the problem, and then to believe that they, over everyone else, understand it best. There are far too many intelligent but ill-equipped engineers posting with hubris their confident (and wrong) assessment of climate science.
One would think that intelligent people would realize that the first step in attacking a new problem is research and education, and well beyond what people here bother to undertake. As an engineer, do you apply everything you did on the last project to the next, or do you learn what’s needed for the new project? Do you take an assignment in Python, but insist on trying to write Python code using Java techniques, then complain when things don’t work right that it’s because Python is a bad language?
One would also think that intelligent people would keep an open mind, and assume that if there’s something that doesn’t look right, that maybe the problem is a gap in their own education and experience, and not with the people who have spent a lifetime training for and working on a problem.
Note to world: study, learn, keep an open mind, and try to realize that arrogance and unjustified self confidence are failings, not virtues.

Ken Harvey
July 21, 2010 6:00 am

If Einstein had said “E = mc², but I am not going to tell you what the symbols stand for nor how I got there”, who would have taken any notice?

July 21, 2010 6:07 am

Quinn the Eskimo says:
July 21, 2010 at 4:18 am
Worrying about intellectual property rights in computer code developed for an academic paper is revealing.
IP rights matter when you are trying to make money from the creation.

Spoken like a true capitalist, and very, very wrong.
Not everything in the world is measured in money. Obviously a scientist is trying to build a reputation and advance his career. Many scientists have worked on the same problem at the same time, basically in a race to be first to the end result. Competition occurs everywhere in human society, not just in financial ventures. It occurs in science.
This crazy idea that somehow all scientists working on the same problem should be part of one, big, happy team is ludicrous. People are entitled to their own intellectual property, and they’re entitled to try to be “the one” to publish that next great ground breaking study. And if some of the information from their last paper is serving as the foundation for the next, then no, they do not and should not have to share it.
It would be like insisting that all businesses share all information, because the ultimate goal is a stronger economy for the country. Someone tried that once, and it failed miserably (see the U.S.S.R. and communism).

red432
July 21, 2010 6:16 am

peakbear says:
“””
I almost dread to say it but what a lot of research institutions need is some decent project management to enable better efficiency. I know this goes against the grain from what a traditional Phd/PostDoc is (working individually) working together would allow much better work to be done and would also be more rewarding.
“””
In my experience you could use some psychologists and mediators as well – I’ve seen large “simulations” built in bits by groups of “scientists” who hated each other and couldn’t stand being in the same room together. Needless to say, the results of the simulation were astounding.

July 21, 2010 6:16 am

Slightly off-topic, but it always grates on me to read about ‘computer codes’. Where did this idea come from? Code, in the context of programming source code, is an aggregate noun and does not require a plural form. You wouldn’t say that a beach contains thousands of tons of sands.
A single computer program is compiled from ‘code’; a suite of computer programs is also compiled from ‘code’. A large project would have hundreds of thousands of lines of ‘code’. The Climategate archive contained a lot of ‘code’.
Does anyone know what grammatical rules people are using when they decide to talk about ‘computer codes’, plural?

Ryan
July 21, 2010 6:29 am

: I get really fed up with the nonsense that seeks to protect climate scientists on the basis that they are somehow super-intelligent humans and the rest of us are not worthy to challenge their output. In passing I remember the invention of the atom bomb, the cover up of BSE and the release of Thalidomide as rather obvious examples where science outdid itself in releasing the product of their hyper-intelligence on the semi-retarded masses that simply couldn’t cope with their supreme creativity.
I, for one, only have an IQ of 137. I couldn’t POSSIBLY be expected to understand how a thermometer is used to measure temperature over long periods of time, nor to understand that perhaps some trees grow better in certain climates producing a higher density of tree rings. After all, I gave up the possibility of a lifetime doing research for a doctorate at a third rate university such as the University of East Anglia to make MONEY in a high-tech industry. What a fool! I could have stood shoulder to shoulder with such giants as Dr Jones.
Naturally I, along with all my fellow dimwits, should volunteer for the gas chambers so that planet earth can be left to be the playground of such scientific luminaries as Al Gore, who will regale the children of this new Utopia with stories of an Earth’s core at millions of degrees Celsius, unchallenged by those of us who know no better.

Scott B
July 21, 2010 6:39 am

Frank says:
July 21, 2010 at 3:00 am
“Rick Bradford (July 21, 2010 at 12:54 am) took Schneider’s quote (see below) about telling scary stories out of context, a gross thing to do right now. When dealing with McIntyre and Mann, did Schneider succeed in keeping his scientific and public ethics from interfering? McIntyre appears to think he tried.”
How is that quote out of context?
Schneider wanted to wear two hats, one as scientist, one as activist. But the only reason anyone took his opinion seriously was because they thought, as a scientist, he would stick to the facts. Instead, he used his scientist hat to “trick” people (a trick being a good way to get people to believe something that is not proven) into giving more weight to his activist views than they deserved.
Did he ever say, “OK, now I’m wearing my activist hat, so while what I say is not supported by the science, I think it eventually will be, and it is too important to let the facts prevent us from taking action anyway”? I doubt it. That would ruin the trick.

vigilantfish
July 21, 2010 6:41 am

Frank says:
July 21, 2010 at 3:00 am
Rick Bradford (July 21, 2010 at 12:54 am) took Schneider’s quote (see below) about telling scary stories out of context, a gross thing to do right now.
———————
Most of us here are familiar with the longer version of the quote. By including the preamble, with its high-sounding justification for choosing between honesty and efficacy, do the ends justify the means? “And like most people we’d like to see the world a better place, which in this context translates into our working to reduce the risk of potentially disastrous climatic change. To do that we need to get some broadbased support, to capture the public’s imagination. That, of course, entails getting loads of media coverage. So we have to offer up scary scenarios, make simplified, dramatic statements, and make little mention of any doubts we might have”. Most of us would like to see the world a better place, but it is totalitarian to assume that one’s personal – or a specific interest group’s – version of what is ‘better’ should be imposed on others through the use of distortions. That, unfortunately, is the legacy of Schneider, a ‘science’ that serves a totalitarian agenda.

July 21, 2010 6:41 am

sphaerica,
You leave a vitally important element out of all your comments: replication.
You try to cover that omission by talking about intellectual property rights, but that is a red herring argument any time there is public money involved — and in the climate sciences, public money is always involved.
Without replication, falsification is extremely difficult. McIntyre was able to reverse engineer Mann’s fraudulent hokey stick through exceptional diligence, and because of some oversights by Mann.
But to this day, 12 years after MBH98, Michael Mann still refuses to publicly archive his code and methodologies — which were paid for by millions of taxpayers. His work product is not Michael Mann’s personal intellectual property. It belongs to the people who paid for it.
People like Schneider gave Michael Mann a free pass to unethically withhold the details of his work, by saying it’s OK to tell half-truths [which as we know are whole lies] in order to advance their agenda.
Refusal to cooperate in replication means that we are expected to take Mann’s word. But “trust me” is not an answer when so much is at stake. Since replication is an essential part of the scientific method, what Mann is doing is not science. It is anti-science propaganda, and enabling that propaganda is a universal tactic of the alarmist contingent. If the alarmist crowd started to honestly work within the scientific method, the whole CO2=CAGW edifice would promptly come tumbling down like the house of cards that it is.

John Whitman
July 21, 2010 6:46 am

With all due respect for Prof Schneider, he was a scientist very much centered in the public arena. With that go all the positives and negatives of public figures. At his unfortunate passing it seems fitting to reflect on all aspects of his prominently public life. My emphasis is on all. I further suggest that the negative aspects be treated in good taste.
John

July 21, 2010 6:55 am

sphaerica says:
July 21, 2010 at 5:59 am
Chris Long says:
July 21, 2010 at 3:54 am
Breathtaking post from “toby”.
I am, like you, a professional programmer, and everything that you said is wrong because it is from the point of view of a programmer tasked with producing commercial code (whether for sale or use within a business). Science is very, very different.

I understand the points you’re making, but I (clearly) disagree. Of course, programs written to support a single paper do not need to meet the same maintainability and quality standards that commercial software does. My own work involves writing a lot of one-off programs (analysing clinical trials data) but that does not mean that it’s ok for them to contain errors or to be incomprehensible.

But it’s not going to be read by a programmer; it’s going to be read by other scientists who only use programming as much as they need to.

You seem to be saying that it’s ok to write incomprehensible code because nobody else is going to try to understand it anyway. That doesn’t fill me with confidence, and is why I (somewhat flippantly) encouraged Toby to educate himself on the benefits of well-documented code.

No, and the reason you don’t understand this is because you don’t understand either science or the peer-review system. The point is not for someone else to run the same program on the same data and get the same result. … The way science works is that someone should be able to take the same hypothesis, and the same readily available data, and completely on their own perform an experiment that yields results consistent with the original study. If it doesn’t, there’s doubt about the hypothesis, and it’s left to science to determine how the two studies differed and obtained different results.

No, I think you missed the point. Of course, in an ideal world, the replication of a study would involve writing brand new code to implement the same processes, thus confirming the results – that’s what we do in my shop: dual programming. However, for a paper to be replicable, it has to accurately describe its methods – you can’t just say ‘we took the data and ran it through our secret algorithm’, nor can you just say ‘we took the data and applied a reverse time-independent wibbleflab test’. Many climate science papers do not adequately describe their methods, as Steve McIntyre has amply documented, and as Phil Jones has admitted. It would be like Marie Curie publishing on the radioactivity of uranium without describing the electrometer apparatus used to make the measurements.
It would be extremely easy for researchers to accurately document their methods by simply releasing their code. The source code developed for the paper forms an ideal description of the methods (assumptions, mistakes and all). Researchers’ reluctance to release code is a major problem, and your post, while giving me a few valuable pointers on how science works, does not address this.
This seems to be a very common misunderstanding in the ‘release your code’ debate. The point is not that people want to run exactly the same code on the same data; nor is it that people want to pick holes in researchers’ programming techniques. The point is simply that the code, which already exists and is sitting on a machine somewhere, provides a totally complete and unambiguous description of what was actually done with the data, at no extra cost to the researchers.
I would be interested to know whether you can provide a good reason for not providing either (a) the code or (b) an equally full description of the methods.
As a follow-up question – in your opinion, if a paper relying on computer data processing does *not* adequately describe the algorithms used (by whatever method) and thus anyone trying diligently to replicate the paper’s results is thwarted, should the original paper be disregarded?
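The “code as a complete method description” argument can be made concrete with a deliberately trivial, invented example. Suppose a hypothetical paper says only that “the series was smoothed”: the prose leaves window length, weighting, and endpoint handling undefined, while a few released lines settle all three (none of this is from any real paper):

```python
# Invented sketch: prose says "the series was smoothed";
# the code fixes every detail that sentence leaves ambiguous.

def smooth(values, window=5):
    """Centred moving average.

    Three choices the code makes explicit that prose often omits:
    - window length (5 points),
    - equal weights (no Gaussian taper),
    - endpoints: the window shrinks rather than padding or dropping.
    """
    out = []
    half = window // 2
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
smoothed = smooth(series)  # [2.0, 2.5, 3.0, 4.0, 5.0, 5.5, 6.0]
```

A replicator who pads the ends with zeros, or drops them, gets visibly different endpoint values from the same “smoothing” sentence – which is exactly the ambiguity that releasing the code removes.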

DennisA
July 21, 2010 6:57 am

Perhaps we should remember the e-mail from Phil Jones, who said he was “cheered” in a strange way, by news of the sudden death of John L. Daly, who died of a heart attack in 2004.
There is a lot of insight into Professor Schneider’s attitude to science, the public and the media at http://stephenschneider.stanford.edu/Mediarology/MediarologyFrameset.html?http://stephenschneider.stanford.edu/Mediarology/Mediarology.html
Masses of links….

Scott B
July 21, 2010 7:02 am

sphaerica says:
July 21, 2010 at 6:07 am – “It would be like insisting that all businesses share all information, because the ultimate goal is a stronger economy for the country.”
No, it would be like requiring all publicly traded companies to provide audited financial statements, so we could know that their claims of profits and growth are legitimate. That way, no one gets ripped off by investing in a company that is misrepresenting itself. And we absolutely require that, because all public companies greatly benefit from appearing to be successful, in both increased investment and prestige.
As you point out, scientists have the same motivation to appear successful. But if we are unable to confirm their findings, how can we know these findings are legitimate? Further, as you said, what if “some of the information from their last paper is serving as the foundation for the next?” Wouldn’t it be possible for a scientist to appear to be very successful, publishing several papers, each building on the last, solving many difficult problems without anyone ever being able to confirm the results?
It seems like this is exactly what happened with the hockey stick and all of its spin offs. A lot of garbage data was massaged with the programming to appear to deliver a telling message. But since no one could review the programming, no one could confirm or deny the results. Then Mann and other scientists built upon the hockey stick data and code, all coming to “better and better” conclusions, without anyone being able to confirm that dendrochronology was actually a useful measure of temperature.

vigilantfish
July 21, 2010 7:05 am

sphaerica says:
July 21, 2010 at 6:07 am
The way science works is that someone should be able to take the same hypothesis, and the same readily available data, and completely on their own perform an experiment that yields results consistent with the original study. If it doesn’t, there’s doubt about the hypothesis, and it’s left to science to determine how the two studies differed and obtained different results. But this is how you test a hypothesis, not by demanding detailed notes. It would be like walking into Madame Curie’s lab and insisting that you use her lab equipment to repeat her experiments.
—————–
As has been argued here, at Climate Audit, and at many other websites many times, the data is not readily available. Why else did Stephen McIntyre have to battle for years to get the data? You also don’t understand how science works. There have been many instances in which scientists could not replicate results because descriptions of equipment and methods do not exactly capture what procedures were used. Some scientific disputes have required other scientists to come into a lab to observe the original scientists at work – occasionally the scientists involved in making a claim unconsciously perform some manoeuvre that turns out to be critical to the experiment. There are cases in which perfectly valid results cannot be replicated until scientists at other institutions learn the exact techniques used by observing those who achieved the original breakthrough.
For example, as John C. Baillar III noted in “The Role of Data Access in Scientific Replication”, a paper presented to ” Access to Research Data: Risks and Opportunities. Committee on National Statistics, National Academy of Sciences”:
In chemistry, I recall hearing about a specific, real, case of a new chemical synthesis in which, despite the best efforts of the originator and the replicator, the latter could not get a specific synthesis to work — until the two of them followed the protocol side by side and found that the critical difference was in whether a metal stirring rod touched the side of the beaker during mixing.
Canadian fisheries biologists, who created the foundation of a thriving but embattled fish farming industry in the Bay of Fundy and in British Columbia, attempted to set up fish culture operations for years, based on descriptions published by successful Norwegian scientists. It was not until Canadian scientists spent a season in Norway observing (rather than reading about) the techniques and technologies involved that Canadian fish farming actually began to succeed.
Therefore if scientists had not been able to recreate Mme. Curie’s experiments just using her written accounts, it would have been quite reasonable to request a direct demonstration from her, or to use her laboratory equipment in the attempt. Likewise, since there is no way on earth that climate ‘science’ can be properly audited without some knowledge of how the data was manipulated, by a provision of algorithms or code, the requests for information by McIntyre and other individuals who have a real respect for scientific understanding should be universally accepted as falling within sound scientific practice. And not just accepted, but actively supported by scientists everywhere!

Andrew30
July 21, 2010 7:08 am

Andrew Zalotocky says: July 21, 2010 at 5:54 am
“More importantly, the fact that you can understand what it’s doing doesn’t necessarily mean you can understand why it’s doing it”
The paper explains the thinking and the why, the formula or program is the what and the how.
Comments are like the pictures in a children’s book: they are not the story, they are only an illustration. Comments are often not up to date and are almost never corrected once they are entered. Computers do not run comments.
Anyone who debugs comments is doomed to fail; debug only code.
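The hazard can be shown with a tiny invented example, in which a stale comment contradicts the code it describes – a reminder that only the code is authoritative, though the mismatch itself is worth flagging in review:

```python
# Invented example of a stale comment: the comment and the code disagree.
# The computer runs the code, so "debug only code" is the right instinct --
# but the mismatch is also a clue that the method drifted after the
# comment (or the paper describing it) was written.

def station_weight(distance_km):
    # STALE comment: "inverse-distance weighting, w = 1/d"
    # Actual code: inverse-SQUARE weighting, w = 1/d**2
    return 1.0 / (distance_km ** 2)

w = station_weight(10.0)  # the comment predicts 0.1; the code returns 0.01
```

Either way the lesson is the same: the released code, not the prose around it, is the record of what was actually computed.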

July 21, 2010 7:10 am

sphaerica says:
July 21, 2010 at 5:59 am
Chris Long says:
July 21, 2010 at 3:54 am
Breathtaking post from “toby”.
I am, like you, a professional programmer, and everything that you said is wrong because it is from the point of view of a programmer tasked with producing commercial code (whether for sale or use within a business). Science is very, very different.

I understand the points you’re making, but I (clearly) disagree. Of course, programs written to support a single paper do not need to meet the same maintainability and quality standards that commercial software does. My own work involves writing a lot of one-off programs (analysing clinical trials data) but that does not mean that it’s ok for them to contain errors or to be incomprehensible.

But it’s not going to be read by a programmer; it’s going to be read by other scientists who only use programming as much as they need to.

You seem to be saying that it’s ok to write incomprehensible code because nobody else is going to try to understand it anyway. That doesn’t fill me with confidence, and is why I (somewhat flippantly) encouraged Toby to educate himself on the benefits of well-documented code.

No, and the reason you don’t understand this is because you don’t understand either science or the peer-review system. The point is not for someone else to run the same program on the same data and get the same result. … The way science works is that someone should be able to take the same hypothesis, and the same readily available data, and completely on their own perform an experiment that yields results consistent with the original study. If it doesn’t, there’s doubt about the hypothesis, and it’s left to science to determine how the two studies differed and obtained different results.

No, I think you missed the point. Of course, in an ideal world, the replication of a study would involve writing brand new code to implement the same processes, thus confirming the results – that’s what we do in my shop: dual programming. However, for a paper to be replicable, it has to accurately describe its methods – you can’t just say ‘we took the data and ran it through our secret algorithm’, nor can you just say ‘we took the data and applied a reverse time-independent wibbleflab test’. Many climate science papers do not adequately describe their methods, as Steve McIntyre has amply documented and as Phil Jones has admitted. It would be like Marie Curie publishing on the radioactivity of uranium without describing the electrometer apparatus used to make the measurements.
It would be extremely easy for researchers to accurately document their methods by simply releasing their code. The source code developed for the paper forms an ideal description of the methods (assumptions, mistakes and all). Researchers’ reluctance to release code is a major problem, and your post, while giving me a few valuable pointers on how science works, does not address this.
This seems to be a very common misunderstanding in the ‘release your code’ debate. The point is not that people want to run exactly the same code on the same data; nor is it that people want to pick holes in researchers’ programming techniques. The point is simply that the code, which already exists and is sitting on a machine somewhere, provides a complete and unambiguous description of what was actually done with the data, at no extra cost to the researchers.
I would be interested to know whether you can provide a good reason for not providing either (a) the code or (b) an equally full description of the methods.
As a follow-up question – in your opinion, if a paper relying on computer data processing does *not* adequately describe the algorithms used (by whatever method) and thus anyone trying diligently to replicate the paper’s results is thwarted, should the original paper be disregarded?
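To make the ambiguity concrete, here is a hypothetical sketch (invented for illustration, not taken from any actual paper): a methods section saying only “we smoothed the series” admits several readings, while a few lines of code pin down every choice – the window length, the weighting, and the treatment of endpoints.

```python
def smooth(series, window=5):
    """Centered moving average. The code fixes what prose leaves open:
    the window is 5 points, the weights are equal, and endpoints use a
    shortened window rather than padding or dropping values."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo = max(0, i - half)          # shortened window at the start
        hi = min(len(series), i + half + 1)  # and at the end
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out

print(smooth([1.0, 2.0, 3.0, 4.0, 5.0]))  # [2.0, 2.5, 3.0, 3.5, 4.0]
```

Any of these choices could reasonably differ between two independent implementations of the same one-sentence description, which is exactly why releasing the code removes the ambiguity for free.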

vigilantfish
July 21, 2010 7:15 am

sphaerica says:
July 21, 2010 at 6:07 am
People are entitled to their own intellectual property, and they’re entitled to try to be “the one” to publish that next great ground breaking study. And if some of the information from their last paper is serving as the foundation for the next, then no, they do not and should not have to share it.
————–
Ahh, yes, if only John von Neumann had not leaked the paper that revealed how to make computers, and the blueprints for computer architecture, John Presper Eckert and John Mauchly could have been computer billionaires, and we’d all be paying royalties to their estates. So what if the work was 100% financed by tax-payers?
The question of intellectual property is simple only if one is privately funded. Corporate scientists have no intellectual property rights independent of the corporate interest they serve, so why should tax-funded scientists be able to lord it over us, tell us how to live our lives, serve up scary stories to get more funding, and then refuse to turn over the basic data and algorithms? Sure, there are no problems if the science remains esoteric and removed from political and moral questions, but when a public agenda is being backed up by science, nothing should be secret. Something’s a little askew in your universe.

Nuke
July 21, 2010 7:22 am

The National Science Foundation has asserted that scientists are not required to present their personal computer codes to peer reviewers and critics, recognizing how much that would inhibit scientific practice.

Whoa!