McIntyre on Stephen Schneider

An excerpt from Steve’s post at Climate Audit

Schneider replied that he had been editor of Climatic Change for 28 years and, during that time, nobody had ever requested supporting data, let alone source code, and he therefore required a policy from his editorial board approving his requesting such information from an author. He observed that he would not be able to get reviewers if they were required to examine supporting data and source code. I replied that I was not suggesting that he make that a condition of all reviews, but that I wished to examine such supporting information as part of my review, was willing to do so in my specific case (and wanted to do so under the circumstances) and asked him to seek approval from his editorial board if that was required.

This episode became an important component of Climategate emails in the first half of 2004. As it turned out (though it was not a point that I thought about at the time), both Phil Jones and Ben Santer were on the editorial board of Climatic Change. Some members of the editorial board (e.g. Pfister) thought that it would be a good idea to require Mann to provide supporting code as well as data. But both Jones and Santer lobbied hard and prevailed on code, but not data. They defeated any requirement that Mann supply source code, but Schneider did adopt a policy requiring authors to supply supporting data.

I therefore re-iterated my request as a reviewer for supporting data – including the residuals that Climategate letters show that Mann had supplied to CRU (described as his “dirty laundry”). The requested supporting data was not supplied by Mann and his coauthors and I accordingly submitted a review to Climatic Change, observing that Mann et al had flouted the new policy on providing supporting data. The submission was not published. I observed on another occasion that Jones and Mann (2004) contained a statement slagging us, based on a check-kiting citation to this rejected article.

During this exchange, I attempted to write thoughtfully to Schneider about processes of due diligence, drawing on my own experience and on Ross’ experience in econometrics. The correspondence was fairly lengthy; Schneider’s responses were chatty and cordial and he seemed fairly engaged, though the Climategate emails of the period perhaps cast a slightly different light on events.

Following the establishment of a data policy at Climatic Change, I requested data from Gordon Jacoby – which led to the “few good men” explanation of non-archiving (see CA in early 2005) – and from Lonnie Thompson (leading to the first archiving of any information from Dunde, Guliya and Dasuopu, if only summary 10-year data inconsistent with other versions). Here Schneider accomplished something that almost no one else has been able to do – get data from Lonnie Thompson – something that, in itself, shows Schneider’s stature in the field.

It was very disappointing to read Schneider’s description of these fairly genial exchanges in his book last year. Schneider stated:

The National Science Foundation has asserted that scientists are not required to present their personal computer codes to peer reviewers and critics, recognizing how much that would inhibit scientific practice.

A serial abuser of legalistic attacks was Stephen McIntyre a statistician who had worked in Canada for a mining company. I had had a similar experience with McIntyre when he demanded that Michael Mann and colleagues publish all their computer codes for peer-reviewed papers previously published in Climatic Change. The journal’s editorial board supported the view that the replication efforts do not extend to personal computer codes with all their undocumented subroutines. It’s an intellectual property issue as well as a major drain on scientists’ productivity, an opinion with which the National Science Foundation concurred, as mentioned.

This was untrue in important particulars and a very unfair account of our 2004 exchange. At the time, Schneider did not express any hint that the exchange was unreasonable. Indeed, the exchange had the positive outcome of Climatic Change adopting data archiving policies for the first time.

As I noted above, at his best, Schneider was engaging and cheerful – qualities that I prefer to remember him by. I was unaware of his personal battles or that he ironically described himself as “The Patient from Hell” – a title that seems an honorable one.

Read more at Climate Audit

Jeff
July 21, 2010 9:13 am

as background I have spent much of the last 18 years testing input data being run thru software, looking at the output data, and validating that the code did exactly what it was supposed to do … not having either the actual code or a detailed description of exactly what the code has done in each and every possible scenario would cause me to fail a program without running a single test … I would have nothing to test … and in my world untested = failed …
No, I don’t need the code but without a detailed description of how the code acts on the data I would be unable to “manually” replicate its behavior to validate its output …
It is simply beyond belief that anyone could think the code is not what is being tested by reviewers …
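Jeff’s rule – untested = failed – is just ordinary unit testing. A minimal sketch of the kind of check he describes, using a hypothetical `anomaly` routine and values recomputed by hand (the routine and numbers are illustrative, not from any climate code):

```python
# Minimal sketch of output validation: compare a routine's output against
# values computed independently by hand. Routine and data are hypothetical.

def anomaly(series, baseline):
    """Subtract the baseline mean from each value in the series."""
    base = sum(baseline) / len(baseline)
    return [t - base for t in series]

# Hand check: the baseline mean is 14.0, so the anomalies are 1.0 and 2.0.
result = anomaly([15.0, 16.0], [14.0, 14.0, 14.0])
print(result)  # [1.0, 2.0]
```

Without the code (or an exact behavioral description), there is no way to know what expected values to check the output against – which is Jeff’s point.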

Kate
July 21, 2010 9:26 am

k winterkorn says at 8:31 am
“Stephen Schneider…His jump to Global Cooling in the 1970′s, and then his turnabout jump to Global Warming shows either a lack of scientific discipline in his thinking or fraud in his public presentation of his beliefs, or some of both.”
…Stephen Schneider has been around a long time, and by the time the great “new ice age” scare was over, he knew his way around the political/academic/scientific grant-giving government machine like it was his own living room. Stephen Schneider wanted fame, recognition, and his bills paid, and he saw his big chance when the global warming rocket was about to take off, so he rushed on board, dumping all his previous work on global cooling as fast as possible. No doubt he was hoping that nobody would notice, or if they did notice they wouldn’t care enough to make a big deal about it.

john a
July 21, 2010 9:27 am

re: jcrabb’s assertion that no one has said that the late 20th century rate of temperature change has been seen before in the last 1,000 years…
Monckton’s presentations cite several since the Middle Ages; it’s one of his central points, that temperature can change abruptly under natural conditions.

George E. Smith
July 21, 2010 9:31 am

“”” jcrabb says:
July 21, 2010 at 12:22 am
For all the criticism of Global temperature reconstructions, no one has created a current Global reconstruction showing the current temperature rise to be within ‘normal’ parameters over the last 1000 years, all Global reconstructions, including even Loehles show current Global temps being the highest for over a thousand years. “””
That’s an odd conclusion; a very odd conclusion.
We are talking about the globally averaged value, over some significant period of time, of a variable that on any given day can span an extreme range of over 150 deg C (and some people suggest as much as a 180 deg C range); a variable for which credible data for almost 3/4 of the planetary total simply is not available – if you take the remote “ground” areas of the world, that is over 3/4 of the total area – and the area that has had some sort of credible monitoring has not even been followed with any consistency over the last 150 years. Stations are added, or subtracted; moved to new locations; and the purveyors of what limited measurement has been done claim measurable differences in hundredths of a degree and decadal changes of maybe a tenth of a degree; and you believe that no period in the last 1000 years has been warmer than the present.
I assume that you are fully aware of the concept of ” 1/f noise ” ; a very common source of random noise in many ordinary physical systems; where the observed amplitude of random fluctuations can range without limit with a spectral distribution such that the observed instantaneous noise amplitude grows as the inverse of the frequency of occurrence. 1/f noise is present in all electronic signal processing systems; and the 1/f character of low frequency noise has been confirmed down to as low a frequency as anybody has ever cared to spend the time observing. The growth without limit does not violate any energy laws, since events of higher power occur with ever diminishing frequency; so they are spread out over an increasing period of time. It is a trivial exercise to prove that 1/f noise contains an equal noise power in each octave of frequency range.
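The equal-power-per-octave claim is easy to check numerically: integrating a 1/f power spectrum over any octave [f0, 2·f0] gives ln 2 regardless of f0. A quick pure-Python sketch using the trapezoid rule:

```python
import math

def octave_power(f0, n=200_000):
    """Integrate S(f) = 1/f over one octave [f0, 2*f0] by the trapezoid rule."""
    h = f0 / n
    total = 0.0
    for i in range(n):
        fa = f0 + i * h
        total += 0.5 * (1.0 / fa + 1.0 / (fa + h)) * h
    return total

low = octave_power(1.0)      # octave 1 Hz .. 2 Hz
high = octave_power(1000.0)  # octave 1 kHz .. 2 kHz
print(low, high)             # both ~ ln 2 ~ 0.6931
```

The analytic reason: the integral of df/f from f0 to 2·f0 is ln(2·f0) − ln(f0) = ln 2, independent of f0.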
So it is quite unlikely that short (<30 years) intervals of warmer (or colder) Temperatures than have been observed recently, have ever been surpassed in the last 1000 years. The recent period of "global warming" itself is less than 30 years from the mid 1970s to the mid 1990s; and it has been on a down slope for the last 10 to 15 years or so.
And you can't get away with that sleight of hand trick; saying that no reconstruction proves that recent warmth is within natural variability.
The burden of proof that it is NOT natural rests on those who would argue that it is NOT natural.
There's plenty of documented anecdotal history of warmer epochs within that 1000 year range; and why limit it to 1000 years; what about the last 10,000 years ?
Quite apart from the noise spectrum aspects of temperature fluctuations; there's the whole problem of measurement rigor; which doesn't even come close to satisfying known governing laws and principles of information theory or sampled data system theory.
And in the end; the matching up of global and historic CO2 data (all the way back to the IGY in 1957/8) to the temperature fluctuations is even worse than our knowledge of either variable.
There's not anyone on this planet who can even tell us definitively what is the correct time delay to use in Stephen Schneider's "Climate Sensitivity" equation:-
T2 - T1 = cs * log(CO2_2 / CO2_1)
Here T2 and T1 are two Mean Global Surface Temperature values (each averaged over some time interval) and CO2_2 and CO2_1 are atmospheric CO2 relative molecular abundances at two time periods, not necessarily coincident with the Temperature epochs.
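As a purely numerical illustration of the form of the sensitivity equation above (the logarithm base and the value of cs below are assumptions for illustration, not values from the comment): with a base-2 logarithm, cs is the warming per CO2 doubling, so a hypothetical cs of 3 °C gives exactly 3 °C for a 280 → 560 ppm doubling.

```python
import math

# Hypothetical numbers, for illustration only: cs = 3 degC per CO2 doubling,
# with the log taken base 2 so a doubling contributes exactly one unit.
cs = 3.0
co2_1, co2_2 = 280.0, 560.0  # ppm
delta_T = cs * math.log2(co2_2 / co2_1)
print(delta_T)  # 3.0
```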
Purported data has tried to relate those two variables with time delays between the Temperature and CO2 observations that can lie anywhere within ±1000 years or more.
Nobel Laureate Al Gore has published data in his famous book showing that the best correlation for any moves of Temperature and CO2 occurs for about an 800 year delay between Temperature changes, and atmospheric CO2 changes; yet he insists that the CO2 changes are what caused the Temperature changes 800 years earlier.
Right now we are just 800 years delayed from the so-called mediaeval warm period; that history shows was warmer than today; and we now are in the midst of the rising CO2 abundance; that apparently caused the mediaeval warming period (well according to Al Gore) and he's a Nobel Laureate; so he should know.
You've got the shoe on the wrong foot, jcrabb. It's up to the Deniers (of natural cause) to prove it is NOT natural cause.

Nuke
July 21, 2010 9:32 am

Dave Springer says:
July 21, 2010 at 8:18 am
@Nuke
“Computer code is deterministic. Run the same code with the same data and the results are always the same.”
If identical results from identical input is a design goal then I’d agree with that in principle but there are a great many possible pitfalls in reality. This is especially true when floating point calculations are involved and the underlying hardware, firmware, and ancillary software is not identical.

I think we are in agreement here. You’re talking about changes in the runtime environment. These are variables which must be controlled and accounted for.
It’s possible to get one result from environment A and different results when moving to environment B. But if you run the same code and same data repeatedly on A, then the same results are expected each run. Run on B with the same code and data and the results may not match A, but each run on B should give the same results.
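One concrete reason results can shift between environments A and B is that floating-point addition is not associative, so anything that reorders the same operations (a different compiler, optimization level, or FPU) can change the answer. A minimal, fully deterministic illustration:

```python
# Floating-point addition is not associative: regrouping the same three
# numbers gives different answers. A compiler or runtime that reorders a
# sum can therefore change results between otherwise-identical environments.
x = (0.1 + 0.2) + 0.3
y = 0.1 + (0.2 + 0.3)
print(x == y)  # False
print(x, y)    # 0.6000000000000001 0.6
```

Within one environment each grouping is perfectly repeatable, which matches the point above: determinism per environment, but no guarantee across environments.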

EthicallyCivil
July 21, 2010 9:32 am

On the publication of source code: Appendix A of my Master’s thesis is the solver I wrote based on the equations developed in the thesis, plus the unit tests of the “closed form” solution test problems.
The results section of the thesis was predictive modelling of the novel aerodynamic device we were building. Compared to climate modelling it was a simple problem, but it had design impacts on the experimental work we were doing. My advisers wouldn’t accept the results without the source code and test code; why should we expect less of those whose conclusions are driving an overhaul of the global economy?

Bernie
July 21, 2010 9:33 am

sphaerica:
Your comments miss the point. To replicate many data analysis results in climate science, the actual analysis needs to be available. It is as simple as that. All the issues about commented code, convoluted code, etc., are secondary. The IP defence or trade secret defence is another red herring. Lastly, the CRU emails clearly indicate the primary motivations among those using idiosyncratic statistical procedures to prepare and analyze the raw data for denying access to both data and code.

Nuke
July 21, 2010 9:35 am

Alan F says:
July 21, 2010 at 8:33 am
Nuke,
Given the end result wished, I can write in f77 (still have all my notes even) as many nested procedures as is required to accomplish such from any set of numbers and so can ANY other schooled in the late 80′s IT dweeb. The bad bit here appears to be any academical coding requires little if any outside debugging and certainly scarce documentation while industrial coding (my job) is checked here, back in Germany and once again here in good old Cannuckville before it makes its way into any systems or machinery. Bad code in information systems costs coin, in machinery lives and neither is allowable in business. Why should the script kiddies in Academia, whose code is setting the stage for %GDP level spending be afforded any less vigilance?

Absolutely agree. Are there no Software Engineering or Computer Science grad students available to assist with climate research?

Editor
July 21, 2010 9:37 am

Modern programming and scripting languages (notably anything labeled VISUAL) use ridiculously-long-variable-names-that-are-intended-to-be-self-documenting. Older languages of the sort I spent a quarter century working with as a business/manufacturing programmer tended to be a lot more terse – and variables and arrays often needed to be re-used – especially in constructions like
FOR Q=1 TO 20
READ (Q) A$, B$, C$
GOSUB nnnn
NEXT Q
or even worse….
Q$=”FILE1FILE2FILE3FILE4FILE5″
FOR I=2 TO LEN(Q$)/5
CLOSE(1)
OPEN(1) Q$((((I-1)*5)+1),5)
READ (1) A$,B$,C$
GOSUB nnnn
NEXT I
REM There is are two bugs in this program. Where are they?
Comments were often critical. My programs were always heavily commented, which cost little since comments are ignored when compiled for execution. The number of times I’ve had to de-bug someone else’s code and was faced with an undocumented GOSUB and was left wondering “just what the hell was THAT all about?”… sometimes you can only figure it out by stepping through the program with various starting values… and voila! it works fine EXCEPT in this one instance where a calendar month has two full moons….

Reed Coray
July 21, 2010 9:42 am

Shortly before his death, Dr. Stephen Schneider gave the Stanford Magazine an interview which was reported in their July/August 2010 issue. I would like to comment on a few of Dr. Schneider’s remarks.
The primary lasting impact will be that it has delayed climate policy by a year or two—which, if the Congress tips away from Democrats, could delay it by eight or more.”
Well, at last we have a concrete example of a “climate change” tipping point – the change from a Democrat-controlled Congress to a Republican-controlled Congress.
When Dr. Schneider was asked the question: “Why do you think it’s wrong to give equal public consideration to most climate-change dissenters?” in part his response was:
It is completely appropriate in covering two-party politics, if you [cover] the Democrat to [cover] the Republican. In fact, if somebody didn’t do that, they would not [be considered] fair and balanced. It is completely inappropriate, if there’s an announcement of the new cancer drug for pediatric leukemia [with] a panel of three doctors from various hospitals, to then give equal time to the president of the herbalist society, who says that modern medicine is a crock. They wouldn’t even put that person on the air, so why put on petroleum geologists—who know as much about climate as we climatologists know about drilling for oil—because they’ve studied one climate change a hundred million years ago?
Why is it inappropriate? Inappropriate to whom? It may be a waste of time. It may be silly. It may even be dumb. But it’s a stretch to call it “completely inappropriate.” Mark Twain wrote a story called “The Man That Corrupted Hadleyburg.” The story is about a town whose claim to fame is honesty. The town’s self-image is so wedded to its reputation for honesty that for years the inhabitants are kept from all temptation lest they succumb and destroy the town’s reputation. In the minds of non-residents, it isn’t long before “arrogance” overtakes “honesty” as the town’s primary descriptor. Eventually the town’s arrogance offends a visitor who believes that because the town’s citizens are seldom tempted, their honesty is only skin deep. Unknown to the other leading citizens, the offended stranger tempts each of the town’s stalwarts with a large sum of money. To a man, they succumb to the temptation. Dr. Schneider and his ilk remind me of the town’s leading citizens both in regards to arrogance and to the belief that non-exposure to nonsensical ideas is the path to salvation.
The reason that we do not ask focus groups of farmers and auto workers to determine how to license airplane pilots and doctors is they have no skill at that. And we do not ask people with PhDs who are not climatologists to tell us whether climate science is right or wrong, because they have no skill at that, particularly when they’re hired by the fossil-fuel industry because of their PhDs to cast doubt. So here is where balance is actually false reporting.”
BALANCE is false reporting! Since when? Reporting is reporting. False reporting is incorrectly describing real-world events, either deliberately or inadvertently. If a group of people claims a secret society exists below the Earth’s surface, reporting what the group says may not serve a useful purpose, but it isn’t “false reporting”.
But climate risks occur at the level of the planet, where there is no management other than agreements among willing countries.”
And what should we do if the number of “willing countries” is small? As much as I believe the political left is leading this country to ruin, I believe the principle of self-government trumps my personal beliefs. If the people want socialism, so be it. After all, when a doctor says you’ll die unless all of your limbs are amputated, the final decision is not the doctor’s, it’s yours.
At least in the old days when we had a Fourth Estate that did get the other side—yes, they framed it in whether it was more or less likely to be true, the better ones did—at least everybody was hearing more than just their own opinion.”
Now I’m confused. I thought Dr. Schneider said it was “completely inappropriate” to give both sides of the “panel of three doctors” and the “president of the herbalist society” equal time. The question I’d like to ask Dr. Schneider, but now cannot, is “Who gets to decide what is appropriate for balanced reporting and what is inappropriate?”
What we have to do is convince the bulk of the public, that amorphous middle.”
So I guess even Dr. Schneider believes the unwashed masses have some say in their own destiny.
But now, given the new media business-driven model, where they fired most specialists and the only people left in the newsroom are general-assignment reporters who have to do a grown-up’s job, how are they going to be able to discern the north end of a southbound horse?”
In the case of some reporters as well as some scientists, it’s as simple as looking over your shoulder.

adamskirving
July 21, 2010 9:45 am

Nuke
“Computer code is deterministic. Run the same code with the same data and the results are always the same. ”
I agree with the thrust of your post, but I would like to emphasise that Dave Springer is not being snarky. Hardware and firmware may have faults that mean the same code produces different results on different machines. Worse still there are certain bugs like ‘race conditions’ in code that sometimes cause programs run on the same machine to exhibit errors at apparently random intervals. That’s why I was taught to not only make my code available on request, but to keep a record of hardware used, operating system and version, compiler and version, and so on.
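A minimal sketch of the kind of race condition adamskirving mentions: several threads perform an unsynchronized read-modify-write, and a deliberately widened race window makes the lost updates show up reliably (real races are the same bug with a microscopic window, which is why they appear at apparently random intervals).

```python
import threading
import time

counter = 0

def unsafe_increment():
    # Classic read-modify-write race: read, pause, write back a stale value.
    global counter
    snapshot = counter
    time.sleep(0.05)          # widen the race window so the bug shows reliably
    counter = snapshot + 1

threads = [threading.Thread(target=unsafe_increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # far less than 10 - most of the increments were lost
```

The fix is a lock around the read-modify-write; without one, the final count depends on thread scheduling, i.e. the program is not deterministic even on a single machine.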

July 21, 2010 9:54 am

Re : Loehle’s reconstruction
Loehle’s data from:
http://www.ncasi.org/programs/areas/climate/LoehleE&E2007.csv
appear to be perfectly reasonable. On the other hand my view may be biased.
http://www.vukcevic.talktalk.net/LFC1.htm
I would love to know what Dr. Loehle would have to say for the above ‘misuse’ of his data?

July 21, 2010 10:07 am

I think one of the great difficulties in releasing source code would be version control. It would be important to get the version of the code actually used to produce whatever result is being analyzed, not necessarily the current version. Given what I’ve seen in the ClimateGate code, I doubt that a good version control system was being employed.
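Even absent a real version control system, the minimum defensible practice is to record exactly which source (and which runtime) produced each archived result. A hedged sketch of such a provenance record – the script contents and field names here are illustrative, not from any actual climate code:

```python
import hashlib
import json
import platform
import sys

def provenance(code_bytes: bytes) -> dict:
    """Record which exact code and runtime produced a result."""
    return {
        "code_sha256": hashlib.sha256(code_bytes).hexdigest(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }

# Illustrative: hash the analysis script's bytes before archiving a result,
# so a reviewer can later confirm which version produced which numbers.
record = provenance(b"print('analysis v1')\n")
print(json.dumps(record, indent=2))
```

A proper VCS (committing and tagging the run) does this and more, but even a hash stored next to each output file would answer the “which version was this?” question.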

Rod Smith
July 21, 2010 10:11 am

Computer Code: Many decades ago I was handed about 4000 lines of assembler language to debug. Labels were limited to six characters. The perhaps 4000 lines of code had two comments. The first line said, “On the first day, I coded.” The last said, “And on the seventh day I rested.”

DirkH
July 21, 2010 10:12 am

Nuke says:
July 21, 2010 at 7:38 am
“Computer code is deterministic.[…]”
Only if you do it right. Think parallel systems. The climatologists use multiprocessor machines, determinism is in this case a quality that must be engineered into the code, and by default a program would be indeterministic. Everybody who now says, oh, that’s basic, has never debugged a multithreaded or multiprocessing program.

Andrew30
July 21, 2010 10:13 am

Robert E. Phelan says: July 21, 2010 at 9:37 am
Q$=”FILE1FILE2FILE3FILE4FILE5″
FOR I=2 TO LEN(Q$)/5
CLOSE(1)
OPEN(1) Q$((((I-1)*5)+1),5)
READ (1) A$,B$,C$
GOSUB nnnn
NEXT I
REM There is are two bugs in this program. Where are they?
1. The initial value for Q1, ”FILE1FILE2FILE3FILE4FILE5″, appears to be a single literal value, which is likely an error unless there is one input file called FILE1FILE2FILE3FILE4FILE5.
2. The statement FOR I=2 TO LEN(Q$)/5 will Not execute a loop since the length of Q1 is 1, from the initialization on the first line, and the initial value of I is 2, and therefore already exceeds 1/5.
3. The statement CLOSE(1) would have failed if the loop was executed (which it will not be (see: 2) since the file 1 is not yet open.
4. The evaluation of ((((I-1)*5)+1),5) will not be executed (see: 2) but if it was then the result value ((( 2 – 1) * 5) + 1) as a first index on a two-dimensional array with a value 6 is out of range for the contents of Q (length = 1 (and it is a single dimensional array)) and the use of a second index of 5 on a single dimensional array (if not caught by the compiler) will also be out of range. Both or either would result in a run-time memory violation.
5. After the read there is no test to check whether any values were assigned to A, B or C; they are therefore undefined.
6. When run the program will do nothing very quickly (see: 2)
That’s Six, not Two; now where is nnnn, and does this language scope variables such that they are visible to nnnn 🙂 ?

George E. Smith
July 21, 2010 10:15 am

Seems to me that this “code review” business has a simple solution.
Back in the days when science could be done on the back of an envelope, people wrote down specific mathematical equations that defined precisely how data was to be processed to “show a connection”.
Planck’s Radiation Law for black body radiation is NOT wishy washy. It is mathematically specific and nothing empirical enters into it. All parameters are derived from fundamental physical constants whose values are known to extremely high precision; and publishing that equation tells any user how to make use of the information.
In the field of Patents, we have a situation, where an “author” also known in this case as an “inventor” publishes for all to see; sufficient information about his previously secret invention to enable ANYONE (having ordinary skill in the art of the invention) to replicate the invention and to use the information to build on for his own purposes; and patent law requires that such teaching takes place in the patent that is finally approved.
IN RETURN for disclosing to ALL how to practice the invention or make use of it, the “author/inventor” is granted exclusive rights to the use and benefits of the invention; and to license the invention to others if (s)he chooses and for monetary gain.
And in most International Patent law; the first to publish gets the rights whether he is the inventor or not.
So why not use the same system for “Scientific Inventions”?
You want to get credit for some scientific breakthrough or discovery; you publish your paper that teaches ANYONE (having ordinary skill in the art) how to use/replicate/expand on your discovery. That could include releasing the code that turned your raw observations into your published results.
Well of course if you don’t teach others how to “do it” and somebody else comes along; and DOES publish code or whatever that shows how to manipulate your data to replicate your results; then of course (S)HE should be the person who gets the academic recognition and credit for the results.
Lots of companies choose to hold key information they obtain as a “Trade Secret”; like the recipe for Coca Cola; rather than Patent it. They are gambling that they can protect their secret from discovery by someone else. But if somebody quite independently should discover the exact same concoction and publish it for all to see; then CC would be screwed; and the re-discoverer would derive the benefit (or other consequence) of his discovery.
So there you have it. If you want credit for some new knowledge; then disclose ALL that is necessary to use; and/or replicate it.
Or keep it to yourself as a “Trade Secret”.
When I do a lens design (for my boss); I really don’t design a lens. I design a “Merit Function” which describes in some formal way what I PERSONALLY consider to be a good lens design for whatever the end purpose is.
The off the shelf optimisation software simply manipulates whatever variables I supplied, until it minimises the value of that Merit Function. The software could care less about lenses or lens design; it simply follows whatever algorithms its authors put into it; to find some minimum value for the function that I claimed constitutes a good lens.
Whenever I send such a design out to be manufactured; the whole design file goes out with it so others can evaluate the performance to see if the lens meets its goals. They can do any kind of evaluation; make any kind of changes or do whatever they like with my design.
The one thing they do not get, is the merit function; which after all is nothing more than my opinion of what a good lens is for the prescribed purpose. Others may have a different description of what is a good lens; and they are free to make up their own description and modify my design as they choose; either improve it if they have a better idea than mine; or royally screw it up, if they don’t. My boss; and our competitors can easily prove for themselves how my design performs; although the competitor may have to buy a product, and reverse engineer my lens.
They just can’t get into my head to find out what I know about lens design. And yes the company archives do have every single thing I have ever done and tried safely stored; so if I get hit by a truck this afternoon; then they finally can learn to appreciate how much I really do know about lens design. They paid for it; it belongs to them.
So if your work for which you expect credit requires computer codes to teach others about your work; if that was the intention; then you should publish the code too; if you want the credit; before somebody else does.

hotrod ( Larry L )
July 21, 2010 10:21 am

Patrick M. says:
July 21, 2010 at 2:40 am
toby said:
“The code you write yourself when doing a paper is not for commercial use and not user-friendly. It is usually idiosyncratic and uncommented (or at least poorly commented). Giving it to somebody implies spending valuable time explaining all the wrinkles as well – a waste of time with someone who should be able to do the job themselves.”
As a professional software developer I can state that if your code is “idiosyncratic and uncommented (or at least poorly commented)”, it is also likely to have bugs. That’s a major issue with your statement above, you are assuming that there are no bugs in the original code. So if somebody tries to replicate your results and fails because the original has bugs, where does that leave us? Obviously, if the original programmer didn’t have time to comment their code then they’re not going to have time to review the code of someone who is trying to replicate their results. So then what?

Good points. In the analogy of computer code to a mathematical expression in a paper you have a good example of the crux of the problem. If a person writes a paper that includes a formula 2+2=5, it can quickly and easily be tested by the reader and found to be in error. If they write a description of the process then it gets a bit more complicated and harder to verify.
Even if the author honestly attempts to write a correct description of the manipulations the code performs with some sort of pseudo code and commenting, that still does not necessarily mean his methods can be accurately duplicated.
As anyone who works in IT and has spent hours or days chasing down an obscure bug – one that only shows up with a specific combination of inputs but works flawlessly on common inputs – knows, the code does not always do what you think it does. This is the heart of the matter.
Just because you tell me you perform steps x, y and z with inputs a, b and c does not necessarily mean that your code actually does that in all cases. For example, it might add x + y and multiply by 2*z if a and b are less than 5*c, but if they sum to a value greater than 5*c it might actually add x+y+c and multiply by 2*(z-c).
No reviewer could find that sort of error by writing their own code unless they exactly duplicated the coding error the author used.
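hotrod’s hypothetical can be written down directly: the paper’s description and the code’s actual behavior agree on common inputs and silently diverge on rare ones, which is exactly why re-implementing from the description alone cannot catch the discrepancy. (All names and numbers below are made up to mirror his example.)

```python
def described(x, y, z):
    """What the paper's methods section says: (x + y) * 2z."""
    return (x + y) * 2 * z

def actual(x, y, z, a, b, c):
    """What the code really does: a hidden branch changes the formula."""
    if a + b <= 5 * c:
        return (x + y) * 2 * z            # common case: matches the paper
    return (x + y + c) * 2 * (z - c)      # rare case: silent divergence

print(actual(1, 2, 3, a=1, b=1, c=1))     # 18, same as described(1, 2, 3)
print(actual(1, 2, 3, a=10, b=10, c=1))   # 16 - a result no description reveals
```

A reviewer’s independent re-implementation of `described` reproduces the common case perfectly and never exercises the hidden branch; only reading (or exhaustively testing) the original source exposes it.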
Computer code output can only be accurately duplicated, and its output verified by testing the exact same source code complied by the exact same compiler on the same hardware.
How many people remember the floating point bug in early PC’s?
It was on the Intel P5 Pentium floating point unit (FPU).
(note wiki reference)
http://en.wikipedia.org/wiki/Pentium_FDIV_bug
A friend of mine worked in a tax assessment office where they did calculations out to something like 6 or 8 decimal places to eliminate rounding errors. The code that worked perfectly on their older pc’s suddenly started having errors when they upgraded to the newer computers because of the then unknown floating point calculation error in the cpu’s.
http://www.intel.com/standards/floatingpoint.pdf
http://cache-www.intel.com/cd/00/00/33/01/330130_330130.pdf
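For the curious, the check that circulated widely at the time divided one specific constant by another; a sketch of it in Python (these are the commonly published test constants; on a correct FPU the residual is effectively zero, while the flawed Pentium FPU returned a quotient wrong in the fourth decimal digit, leaving a residual near 256):

```python
# The classic FDIV sanity check: divide, multiply back, and look at the
# leftover. A correct FPU leaves essentially nothing; the flawed P5
# Pentium left a residual of roughly 256.

x, y = 4195835.0, 3145727.0
residual = x - (x / y) * y
print(residual)  # effectively zero on any correct FPU
```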
Researchers need to move past “idiosyncratic and poorly documented code” for research that could be used as the basis for trillion-dollar decisions affecting billions of people. Even professional code crunchers at Microsoft and other large players in the IT industry routinely crank out buggy software despite years of testing and software control measures.
Everyone knows from personal experience that commercial software written by professional coders and validated through formal beta-testing programs still contains bugs that take years to find once it is used by millions of everyday users.
The presumption that one-off research code is bug free is absolutely absurd!
The presumption should be that it is riddled with bugs, and validation and replication of research results should operate on that assumption. That means that the only legitimate replication of research results would be one that includes outside audit of the research code and its behavior under all imaginable inputs and conditions.
Software with a large line count is statistically far more likely to be buggy than correct. Even if you find a bug, fixing it might introduce another, more serious one. In code running to thousands or millions of lines, bugs are unavoidable.
It is safe to say that all the existing climate model codes contain literally hundreds, if not thousands, of bugs. When you understand that much of this code is written not by professional programmers but by professionals in other fields who happen to write software to assist their research, the likelihood of code errors is probably higher than you would expect in professional commercial software, which, as we all know, is almost certain to contain multiple bugs, many not showing up until the package has been in widespread use for years by millions of users.
http://www.guardian.co.uk/technology/2006/may/25/insideit.guardianweeklytechnologysection
http://www.nist.gov/director/planning/loader.cfm?csModule=security/getfile&pageid=53212
Larry

John Whitman
July 21, 2010 10:26 am

Not being computer programmer oriented, I have a question.
Aren’t there computer software industry standards similar to mechanical engineering’s ASME, AWS and ANSI? If so, do any standards govern the areas of computer software development and control?
John

Jim
July 21, 2010 10:30 am

Giving up the data but not the code is rather like a chemist supplying the data from his or her measurements in an experiment, but not describing the apparatus. How could you have confidence in Millikan’s oil drop experiment, or even understand the numbers, without knowing the apparatus and how it worked? The measurements are rendered meaningless. As for obscure code, all the programming languages I use provide for comments. Comments are essential, because if you look at your code a year from now, even you might not understand what you were trying to do. Sloppy coding practice is no excuse and could be looked upon as unprofessional. I know if I don’t include comments, I get that tag!

Quinn the Eskimo
July 21, 2010 10:35 am

sphaerica says:
July 21, 2010 at 6:07 am
The obligation to disclose data and methods depends on the context, which is my point. Commercial science is under no obligation to disclose data, methods or code, and intellectual property law exists to protect, reward and encourage such invention. National security related research also must be kept secret.
Climate science, the basis of the ultra-radical demand to decarbonize the economy to forestall doom and catastrophe, stands on a different footing. The notion that work in this area should be protected intellectual property while at the same time serving as the foundation of ultra-radical public policy is ludicrous and indefensible.
Further, the norms of purely academic research are fundamentally irreconcilable with the claim of protected intellectual property if disclosure is necessary to replication/falsification. Three separate British Royal scientific societies have said so explicitly in commenting on Climategate.
Finally, if the CRU boys had disclosed their code, they would have disclosed the many gems in the Harry_Read_Me.txt file, such as this one:

printf,1,’IMPORTANT NOTE:’
printf,1,’The data after 1960 should not be used. The tree-ring density’
printf,1,’records tend to show a decline after 1960 relative to the summer’
printf,1,’temperature in many high-latitude locations. In this data set’
printf,1,’this “decline” has been artificially removed in an ad-hoc way, and’
printf,1,’this means that data after 1960 no longer represent tree-ring’
printf,1,’density variations, but have been modified to look more like the’
printf,1,’observed temperatures.’

That’s the “intellectual property” the CRU boys were hiding.

adamskirving
July 21, 2010 10:50 am

@hotrod
The floating point error wasn’t in the P5 but the P1, i.e. the fifth generation of the x86, hence the name Pentium (5). /Pedant off

July 21, 2010 10:50 am

Anthony Watts says:
July 21, 2010 at 8:45 am
sphaerica says:
July 21, 2010 at 6:07 am
People are entitled to their own intellectual property, and they’re entitled to try to be “the one” to publish that next great ground breaking study. And if some of the information from their last paper is serving as the foundation for the next, then no, they do not and should not have to share it.
=============================
I assume then that you will complain on my behalf, loudly, to Mr. Menne and to Director Karl of NCDC who “borrowed” my data from the surfacestations project when it was 43% complete, over my written objections, and over the objections of my co-authors, ignoring us all and revoking all professional courtesy in order to preempt the paper we are now finishing with the data at 87%?
Seems to me that “climate science” gets a free pass when it suits them.
I look forward to seeing your signed complaint letter to them.

I’ll take it from this post that you agree with my position.

REPLY:
I take it from your reply you won’t defend me being abused in exactly the situation you describe. – Anthony

July 21, 2010 10:58 am

Smokey, Chris Long, Scott B., vigilantfish, and all the others…
You still don’t understand how science works. Publishing an exact recipe for how to prove something so that someone else can repeat the exact same process is not going to get anyone anywhere. That’s not how it’s done, or how it should be done.
And you won’t get any traction with me by using McIntyre as any sort of example. If anything, his inclusion in an argument is further evidence of how little you understand how science works. Hint: it’s not engineering, and most of the comments posted here fail to understand the difference.
And for Pamela Gray, Nuke, Tallbloke, and other angry posters… sorry, I’m not biting. Spew your hatred, innuendo and evil mad scientist conspiracies all you want. I’m not interested, because it has no place in the real world. It’s not even worth discussing.

Editor
July 21, 2010 11:17 am

Andrew30 says: July 21, 2010 at 10:13 am
Good try, Andrew, but in Business BASIC, an antique language I spent lots of years working with (too many, probably), that code fragment will execute when supplied with line numbers.
Q$ is in fact a literal value, or, as we called it, a string. Most of the funky code is trying to parse Q$, a fairly common technique when opening and closing a number of files. The function LEN(Q$) returns the length of the string, in this case 25; dividing that length by 5 gives me 5, so the loop in this case is FOR I=2 TO 5. The advantage here is that I can add another file name to my string and not have to change anything else in the code.
In some versions a CLOSE statement may generate an error, but often the default is to simply fall through to the next line. If you want special handling, you can do CLOSE(1, ERR=nnnn)
The “nnnn” notation was meant to suggest a line number for generic purposes. Putting a real line number in, even one that doesn’t exist, will result in the program defaulting to the line after that line number. If there is no subsequent line, the program ends as if it had hit an “END” statement.
The two errors are in the parsing of Q$ in the OPEN statement. It will OPEN and READ files 2, 3 and 4, but not 1 and 5. If I were working with a series of identical files containing data for a given year and my GOSUB was populating an array prior to generating say, a trend, the output might look just fine, especially if I had lots of files.
Context is everything. By the way, which language were your comments based on?
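Since Andrew30’s original fragment isn’t reproduced here, a rough Python analogue of the packed-filename technique, with a hypothetical off-by-one of the kind described (only the middle files get opened, and the output can still look plausible):

```python
# Hypothetical sketch: several fixed-width file names packed into one
# string (the Q$ idiom), sliced apart by dividing the total length by
# the name width. The names and widths here are made up for illustration.

PACKED = "file1file2file3file4file5"   # 25 characters, five 5-char names
WIDTH = 5

def parse_names_buggy(q):
    """Off by one at both ends: yields only the middle names."""
    n = len(q) // WIDTH                # 25 / 5 = 5 names
    return [q[i * WIDTH:(i + 1) * WIDTH] for i in range(1, n - 1)]

def parse_names_fixed(q):
    """Correct bounds: yields all the names."""
    n = len(q) // WIDTH
    return [q[i * WIDTH:(i + 1) * WIDTH] for i in range(n)]

print(parse_names_buggy(PACKED))  # ['file2', 'file3', 'file4']
print(parse_names_fixed(PACKED))  # ['file1', 'file2', 'file3', 'file4', 'file5']
```

If each file held one year of data feeding a trend calculation, the buggy version would still produce a plausible-looking result, just one silently missing its first and last years, which is exactly why the output “might look just fine.”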